Wikidata:Request a query/Archive/2018/03

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

P40 without adopted children and their offspring?

SELECT * { wd:Q658061 wdt:P40+ ?item }
Try it!

Any idea how I could get children and grandchildren without adopted children and their descendants, e.g. Q63014#P40.
--- Jura 06:06, 27 February 2018 (UTC)

This question of trying to do a path query, but with the links subject to a qualifier test, is similar to the question I had above, in #Virtual_graph_?. There is in general no way to write that kind of test into the path statement.
You can get a result which is close to what you are looking for here with
SELECT * { 
   wd:Q658061 wdt:P40+ ?item.
   MINUS {
     wd:Q658061 wdt:P40* ?adoptee_parent.
     ?adoptee_parent p:P40 ?stmt .
     ?stmt pq:P1039 wd:Q25858158 .
     ?stmt ps:P40 ?adoptee .
     ?adoptee wdt:P40* ?item .
  }
}
Try it!
But this does exclude descendants who have multiple lines back to the top of the tree, if any one of those lines was adoptive. (That case, of multiple lines, could perhaps also be queried for).
Of course, if you're only interested in looking down to a finite depth, eg grandchildren, in that case you could make an explicit join for each generation, including an adoptiveness test, rather than using the path query. Jheald (talk) 11:06, 27 February 2018 (UTC)
Thanks. Actually, I started out with Sophia of Hanover (Q57224) and tried to do infinite depth. It's likely that there would be multiple lines back. Normally any living, non-Catholic, non-adoptee descendants are on the line of succession. Maybe gas:service needs a more detailed option than gas:linkType wdt:P40 .
--- Jura 11:16, 27 February 2018 (UTC)
An in-house addition to the existing GAS programs that allowed the presence or absence of a particular triple (or triples?) to be tested for on each link would also solve my #Virtual_graph_? tube line example above. The question I suppose is how often this kind of issue comes up, to make it worth the exploration. But perhaps we could ask User:Smalyshev (WMF) whether he thinks a GAS subroutine modified in this way could be feasible? Jheald (talk) 13:26, 27 February 2018 (UTC)
Maybe GAS can already do it. If not, we might need to scale back on the use of qualifiers.
--- Jura 17:03, 27 February 2018 (UTC)
So if I understand right what you're asking for, you're asking for BFS implementation that can apply certain conditions to nodes as it is traversing. This looks like something that GAS should be able to do, however I do not think any existing GAS program currently does that. I think it is possible to implement one, though not sure what the performance would be. If we just check against the index (i.e. just one hop for each vertex) then it is just index lookup and might be OK. More complex joins might be more problematic. In any case, this would probably require developing a new GAS class. So if you think it is helpful, please submit Phabricator task with examples of problems it would solve. Smalyshev (WMF) (talk) 01:00, 1 March 2018 (UTC)

Select by qualifier value

I need a list of items with P528 where its qualifier P972 = Q20102343. It must show the Q, label/page (for CA), P625, the selected P528 and its qualifiers P585 and P973. Thanks, Amadalvarez (talk) 06:53, 1 March 2018 (UTC)

Sounds like this:
SELECT ?item ?itemLabel ?coordinates ?catalogCode ?pointInTime ?describedAtUrl WHERE {
  ?item p:P528 [ ps:P528 ?catalogCode; pq:P972 wd:Q20102343; pq:P585 ?pointInTime; pq:P973 ?describedAtUrl ] .
  OPTIONAL { ?item wdt:P625 ?coordinates }
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'ca,[AUTO_LANGUAGE]' }
}
Try it!
MisterSynergy (talk) 07:54, 1 March 2018 (UTC)
@ MisterSynergy:. Thanks a lot. Easier that I imagine !. However, why it does'nt show Q11236 ? the only difference I can see is that P972 is not the first qualifier. Amadalvarez (talk) 09:59, 1 March 2018 (UTC)
@Amadalvarez: For Biure (Q11236) there is no described at URL (P973) for the catalog code (P528), see this query
SELECT ?item ?itemLabel ?coordinates ?catalogCode ?pointInTime ?describedAtUrl WHERE {
  ?item p:P528 ?p528stm .
  ?p528stm ps:P528 ?catalogCode;
           pq:P972 wd:Q20102343.
  OPTIONAL { ?p528stm pq:P585 ?pointInTime }
  OPTIONAL { ?p528stm pq:P973 ?describedAtUrl }
  OPTIONAL { ?item wdt:P625 ?coordinates }
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'ca,[AUTO_LANGUAGE]' }
}
Try it!
--Larske (talk) 10:34, 1 March 2018 (UTC)
@Larske: This is correct !. The only qualifiers mandatory is P972 because contains the catàlog to select. The others qualifiers are optional. Thanks !Amadalvarez (talk) 11:12, 1 March 2018 (UTC)

Women born BCE

Hi! I am looking for women who are born BCE and have no sitelink to dewiki. Can anybody help? Thank you, Doc Taxon (talk) 07:41, 1 March 2018 (UTC)

SELECT ?item WHERE {
  ?item wdt:P569 ?dob; wdt:P21 wd:Q6581072; wdt:P31 wd:Q5 .
  MINUS { ?item ^schema:about/schema:isPartOf <https://de.wikipedia.org/> }
  FILTER(YEAR(?dob) < 0) .
}
Try it!
MisterSynergy (talk) 08:02, 1 March 2018 (UTC)

More higher

With this query I have the lis of all the chess player with ELO > 2700, The problem is that are reported more rows for item. I need only the most recent higher ELO or the oldest higher ELO is feasible? (regardless of rank.)

SELECT distinct ?item ?itemLabel ?elo ?date WHERE {
  ?item wdt:P1087 ?elo .
  FILTER(?elo > 2700)
  optional {?item p:P1087 ?statement .}
  optional {?statement pq:P585 ?date .}
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!

--ValterVB (talk) 18:19, 1 March 2018 (UTC)

Slightly trickier than I thought it was going to be, because I originally thought it was possible to use GROUP BY and then LIMIT 1 to return only one row from each group... but it turns out LIMIT doesn't work that way.
However, I think the following query does what you want:
SELECT distinct ?item ?itemLabel ?elo ?date WITH {
  SELECT ?item ?stmt ?elo ?date WHERE {
     ?item p:P1087 ?stmt .
     ?stmt ps:P1087 ?elo .
     FILTER(?elo > 2700)
     optional {?stmt pq:P585 ?date }
  }
} AS %elo

WHERE {
  {
    SELECT ?item (MAX(?date) AS ?date) WHERE {
       INCLUDE %elo .
    } GROUP BY ?item 
  }
  INCLUDE %elo .   
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
This returns players with ELOs over 2700. If there are multiple ELOs for a player, it returns the most recent one that is over 2700. That might not be the current ELO, if their ELO has since fallen. But I think this is what you were asking for. Jheald (talk) 19:02, 1 March 2018 (UTC)
Change MAX() to MIN() to give the earliest date and score over 2700. Jheald (talk) 19:12, 1 March 2018 (UTC)
Perfect, this is exactly what I need. Thanks --ValterVB (talk) 20:28, 1 March 2018 (UTC)

Items with property quote

How do I get all items with the property Property:P1683? Vishvas vasuki (talk) 16:31, 2 March 2018 (UTC)

Found the solution thanks to https://stackoverflow.com/questions/36523546/wikidata-query-find-items-where-property-exists : Try it here. Vishvas vasuki (talk) 03:43, 3 March 2018 (UTC)

BLfD-ID and no P18

Hello! I am looking for all the entities who have a Bavarian monument authority ID (P4244) of D-5-64-000-\d{1,9999} and have no image (P18). How can I do this? Doc Taxon (talk) 18:07, 3 March 2018 (UTC)

... or Bavarian monument authority ID (P4244) beginning with D-5-64-000-
SELECT distinct ?item ?itemLabel ?BavNo WHERE {
  ?item wdt:P4244 ?BavNo .                     # item has a P4244
  MINUS { ?item wdt:P18 [] } .                 # item has no image
  filter(substr(?BavNo,1,11)="D-5-64-000-")    # P4244 value starts with D-5-64-000-
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],de". }
}
Order by ?BavNo
Try it!
I think this is Bavarian monument authority ID (P4244) beginning with D-5-64-000- and without an image, @Doc Taxon:. --Tagishsimon (talk) 18:58, 3 March 2018 (UTC)

@Tagishsimon: Oh thank you, I want to show optional coordinates, too, but only the value in the parentheses.

SELECT ?BLfD_ID ?item ?itemLabel ?itemDescription ?coords ?address WHERE {
  ?item wdt:P4244 ?BLfD_ID .                      # item has a P4244
  MINUS { ?item wdt:P18 [] } .                    # item has no image
  MINUS { ?item wdt:P373 [] } .                   # item has no Commonscat
  OPTIONAL { ?item wdt:P625 ?coords . }           # show coordinates
  OPTIONAL { ?item wdt:P969 ?address . }          # show address
  FILTER(SUBSTR(?BLfD_ID,1,11)="D-5-64-000-")     # P4244 value starts with D-5-64-000-
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,[AUTO_LANGUAGE]". }
}
ORDER BY ?BLfD_ID
Try it!
with this I get: Point(11.06342 49.37388)
but I only want to get: 11.06342 49.37388
Doc Taxon (talk) 19:43, 3 March 2018 (UTC)
SELECT ?BLfD_ID ?item ?itemLabel ?itemDescription ?latlon ?address WHERE {
  ?item wdt:P4244 ?BLfD_ID .                      # item has a P4244
  MINUS { ?item wdt:P18 [] } .                    # item has no image
  MINUS { ?item wdt:P373 [] } .                   # item has no Commonscat
  OPTIONAL { ?item p:P625 ?coord .
             ?coord psv:P625 ?coord_node .
             ?coord_node wikibase:geoLongitude ?lon .
             ?coord_node wikibase:geoLatitude ?lat .
             BIND(CONCAT(STR (?lat), " ", STR(?lon)) AS ?latlon).
           }           # show coordinates
  OPTIONAL { ?item wdt:P969 ?address . }          # show address
  FILTER(SUBSTR(?BLfD_ID,1,11)="D-5-64-000-")     # P4244 value starts with D-5-64-000-
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,[AUTO_LANGUAGE]". }
}
ORDER BY ?BLfD_ID
Try it!

Not the most elegant solution, but it works! :-) Tkarcher (talk) 21:53, 3 March 2018 (UTC)


SELECT ?BLfD_ID ?item ?itemLabel ?itemDescription ?coords ?coord2 ?address WHERE {
  ?item wdt:P4244 ?BLfD_ID .                      # item has a P4244
  MINUS { ?item wdt:P18 [] } .                    # item has no image
  MINUS { ?item wdt:P373 [] } .                   # item has no Commonscat
  OPTIONAL { ?item wdt:P625 ?coords . }           # show coordinates
  OPTIONAL { ?item wdt:P969 ?address . }          # show address
  FILTER(SUBSTR(?BLfD_ID,1,11)="D-5-64-000-")     # P4244 value starts with D-5-64-000-
  BIND(SUBSTR(?coords,7,16) as ?coord2)           # slice the coords you want out of coords
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,[AUTO_LANGUAGE]". }
}
ORDER BY ?BLfD_ID
Try it!
- This seems to work @Doc Taxon: ... only because all the coords seem to have the same number of DP, and a 2-digit ordinal ... which is to say it's a hack & not the right way to do things. --Tagishsimon (talk) 21:55, 3 March 2018 (UTC)

I noticed a slight difference between the solutions based on wdt and the solution based on p above, the version using p gives three more results rows. It boils down to the object Q41427943 giving four identical result rows whereas the other objects only give one result. In the following query I have removed everything else but the coordinates. The only "structural difference" between the object Q41411330 with one result and Q41427943 with four results that I can see is that the latter also has a description in Indonesian (Q9240). Can anyone explain this?

SELECT ?item ?lat ?lon WHERE {
  VALUES ?item { wd:Q41427943 wd:Q41411330 }
#why does Q41427943 give FOUR results rows but Q41411330 only gives ONE?
#the only? structural difference between these two objects is that Q41427943 has a desccription in 'id' (indonesian)!
  ?item p:P625 ?coord .
  ?coord psv:P625 ?coord_node .
  ?coord_node wikibase:geoLongitude ?lon .
  ?coord_node wikibase:geoLatitude ?lat .  
}
Try it!

--Larske (talk) 04:08, 4 March 2018 (UTC)

It has nothing to do with the indonesian translation (which was wrong anyway, so I deleted it now), but with the Wikidata Data Model: https://stackoverflow.com/questions/49012872/duplicated-results-from-wikidata --Tkarcher (talk) 15:40, 4 March 2018 (UTC)
Thanks for the hint. It seems some coordinate values have multiple values (0 and some other value) for wikibase:geoPrecision.
#some coordinate values seem to have multiple values for wikibase:geoPrecision
SELECT ?item ?coord ?geoprecision WHERE {
  VALUES ?item { wd:Q41427943 wd:Q41411330 }
  ?item p:P625 ?p625 .
  ?p625 ps:P625 ?coord .
  ?p625 psv:P625 [ wikibase:geoPrecision ?geoprecision ]
}
ORDER BY ?item
Try it!
This seems to be very common, here are some 8,800 coordinates for church building (Q16970) in France (Q142) with this "multiple precision problem". Are there any plans to correct this?
SELECT ?item ?coord ?geoprecision1 ?geoprecision2 WHERE {
  ?item wdt:P31 wd:Q16970 .
  ?item wdt:P17 wd:Q142 .
  ?item p:P625 ?p625 .
  ?p625 ps:P625 ?coord .
  ?p625 psv:P625 ?psv625 .
  ?psv625 wikibase:geoPrecision ?geoprecision1 .
  ?psv625 wikibase:geoPrecision ?geoprecision2 .
  FILTER(?geoprecision1<?geoprecision2)
}
ORDER BY ?item
Try it!
I also noticed that this API only returns one value for "precision" for Q41427943 when Wikidata Query reports two as per the query above?
--Larske (talk) 12:14, 5 March 2018 (UTC)
Thanks for the thorough analysis. This does indeed sound like a bug either in the database itself or in the query service. If no one comes up with a good explanation, we should open a phabricator task for it. --Tkarcher (talk) 20:11, 5 March 2018 (UTC)
I created a task for it now: https://phabricator.wikimedia.org/T189086 --Tkarcher (talk) 08:38, 7 March 2018 (UTC)

Wikipedia articles categorized in a Wikiproject

I'm looking for a way to retrieve all items that have a Wikipedia article that was categorized in a given Wikiproject. For example, all items with an article in the English Wikipedia that are in the biophysics wikiproject. Is it possible at all? Thank you. --MarioGom (talk) 11:01, 6 March 2018 (UTC)

Yes, using PetScan - https://petscan.wmflabs.org/?psid=3470173. --Jklamo (talk) 12:14, 6 March 2018 (UTC)

All national wildlife areas of Canada in Quebec

Good day, I would like to get a list of all national wildlife areas of Canada (Q337807) located in the province of Quebec (Q176). My problem is that in the different instances of Q337807, the P131 (located in the administrative territory) is either the city or the local region where the item is located. How can I find all of them that are in a given province? Is it possible to have some sort of recursive P131 until it hits the level of a province? Thanks, Amqui (talk) 17:25, 9 March 2018 (UTC)

@Amqui: This query returns eight items, does it seem reasonable?
#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?location WHERE
{
  ?item wdt:P31 wd:Q337807;
        wdt:P131* wd:Q176;
        wdt:P625 ?location. 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
--Larske (talk) 18:31, 9 March 2018 (UTC)
@Larske: It works perfectly. I tried it with another province as well and the return is accurate. Thank you very much, Amqui (talk) 20:14, 9 March 2018 (UTC)

Ottoman Sultans and their Grand Viziers

A lot of attention is given to Turkey at this time. I would like a query where in a chronological order you find the Ottoman Empire Sultans and their Grand Viziers. When this works other high ranking officials can be shown in a separate column. Thanks, GerardM (talk) 06:51, 10 March 2018 (UTC)

@GerardM: Something like this?
SELECT ?sultan ?sultanLabel ?vizier ?vizierLabel ?start ?end WHERE {
  {
    ?sultan p:P39 ?sultan_stmt .
    ?sultan_stmt ps:P39 wd:Q15315411 .
    OPTIONAL {?sultan_stmt pq:P580 ?start } .      
    OPTIONAL {?sultan_stmt pq:P582 ?end } .
  } UNION {
    ?vizier p:P39 ?vizier_stmt .
    ?vizier_stmt ps:P39 wd:Q15063567 .
    OPTIONAL {?vizier_stmt pq:P580 ?start } .      
    OPTIONAL {?vizier_stmt pq:P582 ?end } .
  } 
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} ORDER BY ?start
Try it!
Jheald (talk) 09:05, 10 March 2018 (UTC)
@GerardM: Or alternatively:
SELECT ?sultan ?sultanLabel ?sultan_start ?sultan_end ?vizier ?vizierLabel ?vizier_start ?vizier_end WHERE {
 
    ?sultan p:P39 ?sultan_stmt .
    ?sultan_stmt ps:P39 wd:Q15315411 .
    OPTIONAL {?sultan_stmt pq:P580 ?sultan_start } .      
    OPTIONAL {?sultan_stmt pq:P582 ?sultan_end } .

    ?vizier p:P39 ?vizier_stmt .
    ?vizier_stmt ps:P39 wd:Q15063567 .
    OPTIONAL {?vizier_stmt pq:P580 ?vizier_start } .      
    OPTIONAL {?vizier_stmt pq:P582 ?vizier_end } .
  
    FILTER ((?vizier_start <= ?sultan_end) && (?vizier_end >= ?sultan_start))
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} ORDER BY ?sultan_start ?vizier_start
Try it!
Jheald (talk) 11:15, 10 March 2018 (UTC)

format got URL as an external link

I need an improvement of the previous Select_by_qualifier_value query. Regarding the referenced query (above in this discussion), I'd like to use the "describedAtUrl" (when exists) as a link of the "catalogCode" (Ex.: [URL catalogCode]), because when I show result as a map or timeline instead of table I don't need to see the URL but the link. Thanks, Amadalvarez (talk) 18:31, 10 March 2018 (UTC)

WDQS doesn't know what links are, so doesn't recognise them and can't display them as such.
For external use (eg Wikipages, Listeria, etc), it should be easy enough to build the required strings as desired from the different variables you already have using something like BIND(CONCAT(......) AS ?link) Jheald (talk) 19:06, 10 March 2018 (UTC)
So this doesn't work for display within WDQS:
SELECT ?item ?link WHERE {
    ?item wdt:P973 ?url .
    BIND(CONCAT('<a href="', str(?url), '">link</a>') AS ?link) .
} LIMIT 10
Try it!
and nor does this
SELECT ?item ?link WHERE {
    ?item wdt:P973 ?url .
    BIND(CONCAT('[', str(?url), ' link]') AS ?link) .
} LIMIT 10
Try it!
but such forms can sometimes be useful if results are being created for re-display outside WDQS. Jheald (talk) 19:15, 10 March 2018 (UTC)
@Jheald: It doesn't run ... as I expected, because it build correctly the HTML, but it's not clikable as a link when shows in timeline:
#defaultView:Timeline
SELECT ?item ?itemLabel  ?pointInTime ?link ?notariLabel WHERE {
  ?item p:P528 ?p528stm .
  ?p528stm ps:P528 ?catalogCode;
           pq:P972 wd:Q20102343.
  OPTIONAL { ?p528stm pq:P585 ?pointInTime }
  OPTIONAL { ?p528stm pq:P973 ?describedAtUrl }
  OPTIONAL { ?p528stm pq:P664 ?notari }
  OPTIONAL { ?item wdt:P625 ?coord }
   BIND(CONCAT('<a href="', str(?describedAtUrl), '">', ?catalogCode, '</a>') AS ?link) .
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'ca,[AUTO_LANGUAGE]' }
Try it!

Amadalvarez (talk) 09:43, 11 March 2018 (UTC)

@Amadalvarez: As I said, this doesn't work within WDQS. Jheald (talk) 09:47, 11 March 2018 (UTC)
@Jheald: Excuse me, I did not understand. In listeria, the HREF option doesn't work, but the second one (claudators) runs ok. Thanks, Amadalvarez (talk) 10:00, 11 March 2018 (UTC)

Fixing descriptions with a certain error

I am trying to fix an error with Bengali descriptions that was propagated by those using namescript.js, namely the change from the incorrect "পারিবারিক নামাম" to the correct "পারিবারিক নাম". I have this query, which should list those family name items with the faulty description, but it has timed out as written and in several of its previous revisions:

select ?item ?itemLabel ?description where {
  ?item wdt:P31 wd:Q101352.
  ?item schema:description ?description.
  filter(lang(?description) = "bn" && regex(str(?description), "পারিবারিক নামাম.*"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],bn,en". }
}
Try it!

Is there a more efficient way of generating this list? Mahir256 (talk) 22:01, 11 March 2018 (UTC)

Do you really need the regex function? Otherwise this would do the job:
select ?item ?itemLabel ?description where {
  ?item wdt:P31 wd:Q101352; schema:description 'পারিবারিক নামাম'@bn .
  SERVICE wikibase:label { bd:serviceParam wikibase:language '[AUTO_LANGUAGE],bn,en' }
}
Try it!
MisterSynergy (talk) 22:32, 11 March 2018 (UTC)
Running string operations on a class with 239,000 members is never going to be fast.
One option would be to run the job and make the fixes as a batch process, extracting and fixing a finite number at once:
select ?item ?label_bn ?label_en ?description where {
  ?item wdt:P31 wd:Q101352.
  ?item schema:description ?description.
  filter(lang(?description) = "bn" && regex(str(?description), "পারিবারিক নামাম.*")) .
  OPTIONAL {?item rdfs:label ?label_bn . FILTER(lang(?label_bn) = "bn")} .
  OPTIONAL {?item rdfs:label ?label_en . FILTER(lang(?label_en) = "en")} .
} LIMIT 1000
Try it!
Having looked at some of the results, it's apparent that User:MisterSynergy's query is not going to be enough, because you need to account for the different values after the Bengali text. I tried STRSTARTS() rather than the regex, but it still timed out:
select ?item ?itemLabel ?description where {
  ?item wdt:P31 wd:Q101352; schema:description ?description.
  FILTER( STRSTARTS(?description, 'পারিবারিক নামাম')) .
  SERVICE wikibase:label { bd:serviceParam wikibase:language '[AUTO_LANGUAGE],bn,en' }
}
Try it!
Indeed, looking at it on the LIMIT 1000 query, the query is actually slower if one omits the filter(lang(?description) = "bn" &&; it's also slower if one omits the STR() cast before STRSTARTS.
The most efficient I think I can therefore do is:
select ?item ?label_bn ?label_en ?description where {
  ?item wdt:P31 wd:Q101352.
  ?item schema:description ?description.
  filter(lang(?description) = "bn" && STRSTARTS(str(?description), "পারিবারিক নামাম")) .
  OPTIONAL {?item rdfs:label ?label_bn . FILTER(lang(?label_bn) = "bn")} .
  OPTIONAL {?item rdfs:label ?label_en . FILTER(lang(?label_en) = "en")} .
} LIMIT 1000
Try it!
Jheald (talk) 22:49, 11 March 2018 (UTC)
Actually, on further runs, sometimes the regex is faster, sometimes the STRSTARTS(). So I haven't really been able to help you at all, apart from the suggestion to batch the queries and fix the data in chunks. Jheald (talk) 23:07, 11 March 2018 (UTC)
@MisterSynergy, Jheald: Thank you both; I guess I'll just fix them in batches, then. Mahir256 (talk) 23:16, 11 March 2018 (UTC)
@MisterSynergy, Jheald: Alternatively, there was PetScan. Mahir256 (talk) 23:45, 11 March 2018 (UTC)
... which can do the label lookup and regex as a post-processing operation, after the basic query. Smart. Jheald (talk) 23:49, 11 March 2018 (UTC)

Can't query some labeled items

Dear all,

I want to check, if a person exists and return its identifier.

This works fine:

SELECT DISTINCT ?item WHERE {
  ?item ?label "Hartmann von Aue".
  ?item wdt:P31 wd:Q5.
}

--> Q75852 So why does that query:

SELECT DISTINCT ?item WHERE {
  ?item ?label "Heinrich von Rugge".
  ?item wdt:P31 wd:Q5.
}

not return any results? It should have found Q1599658.

What am i missing here?

THX  – The preceding unsigned comment was added by OvWolkenstein (talk • contribs).

I'm not sure why your solution is returning any results (for first query), but you can try this one:
select ?item where {
  ?item wdt:P31 wd:Q5; rdfs:label 'Heinrich von Rugge'@en .
}
Try it!
Of course, it will search only English. --Edgars2007 (talk) 10:43, 13 March 2018 (UTC)
The reason why the first query returned any result is that Hartmann von Aue (Q75852) have a Commons category (P373) as well as a Commons gallery (P935) with the value "Hartman von Aue" as can be seen from the slightly modified query below. So the use of "?label" in the query is wrong, it has nothing to do with the actual label for the item. It is just a name of a variable for the property.
SELECT DISTINCT ?item ?label WHERE {
  ?item ?label "Hartmann von Aue".
  ?item wdt:P31 wd:Q5.
}
Try it!
--Larske (talk) 11:57, 13 March 2018 (UTC)

Integer numbers, sorted by (thresholded) number of times they appear as values in statements

Rough outline: I'd like to find all the integers   for which that number appears in at least   statements (e.g.  ) as the value of that statement, ideally in a way that distinguishes between different data types (e.g. numerical and monolingual string).

Here is what I have: I can find items that are integers

The following query uses these:

as well as items that have statements with a specific value (e.g. 1024)

The following query uses these:

SELECT ?item ?property WHERE {
   ?item ?property "1024".
}

but I don't know how to loop over these integers as values in the statements. Any pointers? Thanks. --Daniel Mietchen (talk) 01:29, 14 March 2018 (UTC)

@Daniel Mietchen: Here's an initial pointer which combines your two queries, but doesn't deal with some of the 2nd order desirables.
SELECT DISTINCT ?number ?numberLabel ?item ?property ?itemLabel WHERE {
  ?number wdt:P31/wdt:P279 wd:Q12503 . #find integers
  ?number rdfs:label ?numberLabel . FILTER(lang(?numberLabel) = "en")  #get the integer's en label
  ?item ?property ?numberLabel .  #find items for which the integer's label value is an item property value
  ?item rdfs:label ?itemLabel . FILTER(lang(?itemLabel) = "en") #show only en itemLabels 
}
Try it!
- hth. --Tagishsimon (talk) 09:48, 14 March 2018 (UTC)
SELECT DISTINCT ?numberLabel (count(?numberLabel) as ?count) WHERE {
  ?number wdt:P31/wdt:P279 wd:Q12503 . #find integers
  ?number rdfs:label ?numberLabel . FILTER(lang(?numberLabel) = "en")  #get the integer's en label
  ?item ?property ?numberLabel .  #find items for which the integer's label value is an item property value
  ?item rdfs:label ?itemLabel . FILTER(lang(?itemLabel) = "en") #show only en itemLabels 
}
group by ?numberLabel
having (?count > 20000)
order by DESC(?count)
Try it!
I can haz count & a 20k threshold. That leaves us with <<different data types (e.g. numerical and monolingual string)>>. I'm a bit stumped there, for now. --Tagishsimon (talk) 10:50, 14 March 2018 (UTC)
That's quite interesting already, e.g. in that 8 is more frequent than 4, which is also the case in German and Korean. Thanks! --Daniel Mietchen (talk) 15:08, 14 March 2018 (UTC)
I’d choose another approach:
SELECT ?numberItem ?type ?value (COUNT(*) AS ?cnt) WITH {
  SELECT ?numberItem ?type ?value WHERE {
    ?numberItem wdt:P31/wdt:P279 wd:Q12503; p:P1181/psv:P1181/wikibase:quantityAmount ?numericalValue .
    FILTER(?numericalValue > 9000) . # optional to prevent timeouts
#depending on the statement data type, un-comment one of the four following lines. You may have to adjust the FILTER value above to yield results.
    BIND(?numericalValue AS ?value) . BIND(wikibase:Quantity AS ?type) . # numerical data type properties
#    BIND(STR(?numericalValue) AS ?value) . BIND(wikibase:String AS ?type) . # string data type properties
#    BIND(STR(?numericalValue) AS ?value) . BIND(wikibase:ExternalId AS ?type) . # external-id data type properties; technically also plain strings
#    BIND(STRLANG(STR(?numericalValue), 'en') AS ?value) . BIND(wikibase:Monolingualtext AS ?type) . # monolingualtext data type properties for language code 'en'
  }
} AS %subquery WHERE {
  INCLUDE %subquery .
  ?type ^wikibase:propertyType/wikibase:directClaim ?wdt .
  [] ?wdt ?value .
} GROUP BY ?numberItem ?type ?value HAVING (?cnt > 10)
Try it!
In the subquery, one can un-comment one of the four type castings to numeric, string, external-id, or monolingualtext. However, it times out if any numeric Integer value is looked for, thus I restricted it to values over 9000 with more than 10 occurrences. —MisterSynergy (talk) 12:24, 14 March 2018 (UTC)
Thanks — that seems to do the job to the extent that the timeout limits allow for, and I learned some new bits about how to handle the different data types. Great! --Daniel Mietchen (talk) 19:10, 14 March 2018 (UTC)

Simple tree of upper ontology?

Is there any way to show in a query a simple tree of the top-level tree? Before I used the tree tool, unfortunately now the query times out. Thanks! Micru (talk) 13:21, 15 March 2018 (UTC)

@Micru: Can the Wikidata Graph Builder be of any help? --Larske (talk) 20:58, 15 March 2018 (UTC)
@Micru: Something like this may be useful to gather the items within n steps. There is a 'tree' display mode, but I am not sure what columns it needs to be fed.
# Subclass tree to depth maxIterations

SELECT ?item ?itemLabel ?depth ?parent ?parentLabel WHERE {
  {
     SERVICE gas:service {
       gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.BFS" .
       gas:program gas:linkType wdt:P279 .
       gas:program gas:traversalDirection "Reverse" .	
       gas:program gas:in wd:Q35120 .
       gas:program gas:out ?item .
       gas:program gas:out1 ?depth .
       gas:program gas:out2 ?parent .
       gas:program gas:maxIterations 2 .
     }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} ORDER BY ?depth ?parentLabel
Try it!
Jheald (talk) 21:52, 15 March 2018 (UTC)
So. It seems the 'tree' display option just hides stuff, and then lets you show it; not what we were looking for here. The 'graph' display is more what we're after, and has a button to show stuff hierarchically from the left -- but struggles a bit with a solution set of 500 items.
Somebody else may be able to improve on the above. Jheald (talk) 22:09, 15 March 2018 (UTC)
@Larske: Thanks for the heads up, but the Wikidata Graph Builder seems too cluttered to be usable.
@Jheald: I fear that the graph display is going to be equally cluttered. How do I activate it anyway? Micru (talk) 14:21, 16 March 2018 (UTC)
@Micru: Interactively, click on the icon of an eye at the top left of the results panel. 'Graph' is the final option of the drop-down menu.
Alternatively, you can add #defaultView:Graph at the top of the query, for the results to automatically appear in the graph view.
But instead of that, I think that this below (using the tree view that I dismissed previously) may be what you are looking for:
# Subclass tree to depth 3

#defaultView:Tree
SELECT ?root ?rootLabel ?c1 ?c1Label ?c2 ?c2Label ?c3 ?c3Label WHERE {
  VALUES ?root { wd:Q35120  }
  ?c1 wdt:P279 ?root .
  OPTIONAL {
     ?c2 wdt:P279 ?c1 .
     OPTIONAL {
       ?c3 wdt:P279 ?c2 .
     }  
  } 
    
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} ORDER BY ?c1Label ?c2Label ?c3Label
Try it!
Jheald (talk) 16:10, 16 March 2018 (UTC)
@Jheald: That's exactly what I was looking for! Thanks a lot! Micru (talk) 17:31, 16 March 2018 (UTC)

What is a query that can get a village in India? Thanks. Artix Kreiger (talk) 23:14, 16 March 2018 (UTC)

@Artix Kreiger: In its simplest form, you perhaps want something like:
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q532 .
  ?item wdt:P17 wd:Q668 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" } .
}
Try it!
which returns about 75,000 hits.
You may want to add additional fields to this, like state, coordinates, Commons category, etc; or select a subset in some way.
It's also possible that there might be similar-ish population centres, not designated as village (Q532).
But 75,000 should keep you going for a little while.  :-) Jheald (talk) 00:07, 17 March 2018 (UTC)
@Artix Kreiger: It will definitely be worth your time to learn Pywikibot so that you can add a bunch of descriptions to an item all in one edit. @Tohaomg:, to whom I suggested also aggregate multiple description additions into single edits per item. Also @Aftabuzzaman: who added descriptions in nine languages with his own bot to Indian village items all in single edits. Mahir256 (talk) 04:35, 17 March 2018 (UTC)
@Mahir256:, hmmm it may be worth it. I'll look into it. Artix Kreiger (talk) 12:51, 17 March 2018 (UTC)

Papers with the most authors

A query, please for academic journal article (Q18918145) (or sub-classes) with the most authors - whether listed as author (P50) or author name string (P2093) - bonus points if you avoid double-counting by checking for repeated series ordinal (P1545). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits

There are so many scientific paper items (and some of them have so many authors), there's real difficulty in getting a query like this to run in the time.
The following, for example, already cut down quite a lot from your request, times out:
SELECT ?paper (COUNT(?author_string) AS ?count) WHERE {
  ?paper wdt:P31 wd:Q13442814.
  ?paper wdt:P2093 ?author_string .
} GROUP BY ?paper
ORDER BY DESC(?count)
LIMIT 100
Try it!
This query may need bespoke tools running on a database dump. (Unless anyone can think of something really clever). Jheald (talk) 23:41, 17 March 2018 (UTC)
@Pigsonthewing: After further thought. Here are all the papers with series ordinal (P1545) > 1000:
SELECT ?paper ?paperLabel ?count
WITH {
  SELECT ?paper WHERE {  
     ?paper wdt:P31 wd:Q13442814.
     ?paper p:P2093/pq:P1545 "1000" .
  }
} AS %papers 
WITH {
  SELECT ?paper (MAX(xsd:integer(?val)) AS ?count) WHERE {
     INCLUDE %papers .
     ?paper p:P2093/pq:P1545 ?val .
  } GROUP BY ?paper 
} AS %author_count
WHERE {
   INCLUDE %author_count .
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}                  
ORDER BY DESC(?count)
Try it!
Jheald (talk) 00:03, 18 March 2018 (UTC)
That's a neat trick! Thank you. Combined Measurement of the Higgs Boson Mass in p p Collisions at √s=7 and 8 TeV with the ATLAS and CMS Experiments (Q21558717) is the "winner", with over 5K authors and a >3Mb page-size. I can't open it properly, on a Dell XPS! Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:15, 18 March 2018 (UTC)
@Pigsonthewing: Unfortunately there seems to be some gaps in the series ordinal (P1545) sequence. For the Combined Measurement of the Higgs Boson Mass in p p Collisions at √s=7 and 8 TeV with the ATLAS and CMS Experiments (Q21558717) there are 59 series ordinal (P1545) missing, the first one being 802. So if you want the count, i.e. 5095, rather than the maximum series ordinal (P1545), i.e. 5154, I think you have to use the COUNT function instead of MAX. The query will still ignore papers with the series ordinal (P1545) 1000 missing, like Abstracts from the 3rd International Genomic Medicine Conference (3rd IGMC 2015) : Jeddah, Kingdom of Saudi Arabia. 30 November - 3 December 2015 (Q28552866), and I have no suggestion on how to capture them. --Larske (talk) 14:21, 18 March 2018 (UTC)
There may also be author name string (P2093) without any series ordinal (P1545) at all. In Combined Measurement of the Higgs Boson Mass in p p Collisions at √s=7 and 8 TeV with the ATLAS and CMS Experiments (Q21558717) there seems to be one as the COUNT becomes 5096 if you omit the "/pq:P1545" from the second SELECT in the query. --Larske (talk) 16:48, 18 March 2018 (UTC)

qualifiers used with significant event (P793)

I would like to learn what kind of qualifiers are used and how often with different events listed in significant event (P793) property. I was working on the query below, but could not get it to work. can anybody help? --Jarekt (talk) 16:10, 20 March 2018 (UTC)

SELECT ?prop ?propLabel (count(*) as ?count)
WHERE
{
	?item p:P793 ?statement .
	?statement ?pq ?value .
	?statement ps:P793 wd:Q22340494 .
	?prop wikibase:qualifier ?pq .
	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" } .
}
group by ?prop ?propLabel 
order by desc(?count)
LIMIT 100
Try it!
Please note the "List of qualifiers" in the top right of the property talk page. Sjoerd de Bruin (talk) 17:11, 20 March 2018 (UTC)
Thanks, that is what I was looking for. --Jarekt (talk) 19:01, 20 March 2018 (UTC)

List of all scientific articles with a PMCID and a title with a trailing asterisk and a dot (*.)

example: https://www.wikidata.org/wiki/Q50429148

@Mahdimoqri: Some timeout problems, hence the limit of 100... you may be able to use the (currently commented out) OFFSET value to choose increments of the set you're after.
SELECT ?item ?itemLabel 
WHERE
{
  ?item wdt:P932 ?value.
  ?item rdfs:label ?itemLabel. FILTER(lang(?itemLabel) = "en")
  FILTER STRENDS(str(?itemLabel), "*.") .
}
# offset 100
limit 100
Try it!
hth --Tagishsimon (talk) 05:20, 21 March 2018 (UTC)
@Tagishsimon: any other fix around the timeout problem? There are more than a thousands such items and I'm getting a timeout with offset=1000 Mahdimoqri (talk) 12:27, 21 March 2018 (UTC)
@Mahdimoqri: not that I can with confidence point you to. Perhaps the Quarry SQL query service will do it, but I'm not familiar that+wikidata. See, for instance this archive discussion, which touched on a similar timeout problem. --Tagishsimon (talk) 18:52, 21 March 2018 (UTC)
@Tagishsimon: thanks for directing me to Quarry - really neat tool! Mahdimoqri (talk) 23:50, 21 March 2018 (UTC)
@Mahdimoqri, Tagishsimon: You're trying to do a string operation on almost 4 million items. That's never going to be quick.
One thing you could try is to see if moving the LIMIT further up the query helps, so something like:
SELECT ?item ?itemLabel
WITH {
  SELECT ?item WHERE {
      ?item wdt:P932 ?value.
  } LIMIT 500000 OFFSET 3500000
} AS %items
WHERE {
  INCLUDE %items .
  ?item wdt:P1476 ?itemLabel. #FILTER(lang(?itemLabel) = "en")
  FILTER STRENDS(str(?itemLabel), "*.") .
}
Try it!
One catch is that I don't know how deterministic SELECT is. If one re-runs the query 8 times with different OFFSETs, will one get 8 completely different sets of results? The standard doesn't guarantee it (in fact warns not to expect it), but we might be lucky.
Unfortunately if we put an ORDER BY in the subselect to force an ordering, that sorting takes too long, and the query times out. Jheald (talk) 20:37, 21 March 2018 (UTC)
I should also flag that I changed to query to look at title (P1476) rather then rdfs:label in the hope that the smaller join might be a little quicker. Jheald (talk) 20:53, 21 March 2018 (UTC)
@Jheald: thanks, this is wonderful! I could parse through all 4 million items using only 4 runs. Putting the result here for reference. Mahdimoqri (talk) 23:50, 21 March 2018 (UTC)
As already indicated by User:Tagishsimon, Quarry is the tool to use here to do the expensive string lookup: quarry:query/15246. It only looks for en-labels ending with *., thus you’d have to filter the Quarry result for PMCID (P932) presence using SPARQL. —MisterSynergy (talk) 21:47, 21 March 2018 (UTC)
@MisterSynergy: thanks for the quarry query. If you have the energy, two questions: 1. can the quarry query be extended to query for PMCID (P932) presence, and 2. if not, what's the suggested mechanism for using SPARQL to winnow a quarry result set. Academic questions for my personal betterment. thx --Tagishsimon (talk) 11:00, 22 March 2018 (UTC)
Without code. 1) you can use pagelinks table. Note, that this isn't always good. That wouldn't know, how the property is used - as property, qualifier or reference. For items it is worse - it won't know, to which property it's used. 2) I usually download full SPARQL dataset (items which uses property X) and do intersection with quarry results. Of course, programmatically, not manually (e.g. with Excel). --Edgars2007 (talk) 11:16, 22 March 2018 (UTC)
No, unfortunately the SQL database does not contain any information about statements. The wb_terms table just contains all labels, descriptions, and aliases in all languages, and SQL string operations are apparently significantly faster than SPARQL string operations. The Quarry took somewhat over one minute to execute, and it tries to find matches in all 33.2M English labels during that time.
To further process these results in SPARQL, you need to transform the output to:
SELECT ?item WHERE {
  VALUES ?item {
    wd:Q42924406 wd:Q45360137 wd:Q43832114 wd:Q50427080 # … and so on; don’t hesitate to put a large number of items here
  }
  ?item wdt:P932 [] .
}
Try it!
You need to put all items from the Quarry into the third line with wd: prefix for each Q-ID, separated by spaces. It is very simple to prepare that line if you download the Quarry result, (e.g. TSV format), put the result to Excel, copy the column containing Q-ID to a (powerful) text editor such as Notepad++ (Q2033), and perform a regex find&replace transformation (replace \r\n by  wd: – mind the leading space in the latter). If \r\n does not work, try \n (this is very likey on Linux systems the case, see en:Newline). The SPARQL query then allows you to work on the results set, and you can do whatever you’d like to with it. —MisterSynergy (talk) 11:23, 22 March 2018 (UTC)
Thanks; I'm obliged to you both, Edgars2007 & MisterSynergy. I'll play :) --Tagishsimon (talk) 13:12, 22 March 2018 (UTC)

Getting the list of taxa described by a given taxonomist, or more generally by the people of an insitution (or a country)

Hello, is it possible to get the list of taxa described by Carl Linnaeus (Q1043), or described by the people of a given insitution (or a country), by using the taxon author (P405) property nested in the taxon name (P225) statement? Thanks in advance for any help, Totodu74 (talk) 14:43, 23 March 2018 (UTC)

@Totodu74: it is - here's an example for Carl Linnaeus (Q1043) ... come back if you want the query developed in any particular direction.
SELECT ?taxon ?taxonLabel 
WHERE 
{
  ?taxon wdt:P31 wd:Q16521 . 
  ?taxon p:P225 ?name . 
  ?name ps:P225 []; pq:P405 wd:Q1043 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
hth --Tagishsimon (talk) 15:23, 23 March 2018 (UTC)
Thanks a lot, this is great! What if I want to know all taxa described by people working in the Paris museum (employer (P108) of taxon author (P405) is Muséum national d'histoire naturelle (Q838691))? Totodu74 (talk) 15:27, 23 March 2018 (UTC)
@Totodu74: so that would seem to be:
SELECT ?taxon ?taxonLabel ?author ?authorLabel
WHERE 
{
  ?taxon wdt:P31 wd:Q16521 . 
  ?taxon p:P225 ?name . 
  ?name ps:P225 []; pq:P405 ?author .
  ?author wdt:P108 wd:Q838691 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
--Tagishsimon (talk) 15:49, 23 March 2018 (UTC)
This is perfect. Thank you so much! Totodu74 (talk) 16:07, 23 March 2018 (UTC)

All items with multiple instances of same statement

Example:

 – The preceding unsigned comment was added by Mahdimoqri (talk • contribs) at 19:41, 23 March 2018‎ (UTC).

Be aware that some items should have more than one instance of the same statement if the qualifiers are different, eg: somebody who became mayor, left office, then later became mayor again.
In principle a query like this can do it:
SELECT ?item ?prop ?value (COUNT(DISTINCT(?stmt)) AS ?count) WHERE {
  ?item wdt:P921 wd:Q205764 .
   ?prop wikibase:claim ?p .
   ?prop wikibase:statementProperty ?ps .
   ?item ?p ?stmt .
   ?stmt ?ps ?value .
}
GROUP BY ?item ?prop ?value
HAVING (?count > 1)
Try it!
The trouble is, you're wanting to compare every single statement on every single item with every other one.
That can be done if we do something to make the domain we're interested in very very narrow -- eg the statement
?itemmain subject (P921)endometriosis (Q205764)
above, designed to capture the original example you gave.
But to run that same query over eg all 4 million scientific papers is very hard to achieve in 60 seconds. Jheald (talk) 20:10, 23 March 2018 (UTC)
Still, here are some repeated statements for mayors:
SELECT ?item ?itemLabel ?prop ?propLabel ?value ?valueLabel ?count WHERE {
  {
    SELECT ?item ?prop ?value (COUNT(DISTINCT(?stmt)) AS ?count) WHERE {
       hint:Query hint:optimizer "None".
       ?item wdt:P39 wd:Q30185.
       ?item ?p ?stmt .
       ?prop wikibase:claim ?p .
       ?prop wikibase:statementProperty ?ps .
       ?stmt ?ps ?value .
    }
    GROUP BY ?item ?prop ?value
    HAVING (?count > 1)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
Jheald (talk) 20:18, 23 March 2018 (UTC)
Thanks Jheald (talkcontribslogs) I will explore this in more detail.

Most common title of paintings

Hi,

I would like to see the result of this query: the most common titles of paintings, based on the label in English. Thanks in advance! Tubezlob (🙋) 12:09, 25 March 2018 (UTC)

SELECT ?itemLabel (COUNT(?item) as ?count) WHERE {
  ?item wdt:P31 wd:Q3305213 .
  ?item rdfs:label ?itemLabel .
  FILTER (lang(?itemLabel)="en")
  } GROUP BY ?itemLabel
ORDER BY DESC(?count)
Try it!

Mbch331 (talk) 13:00, 25 March 2018 (UTC)

Thanks a lot Mbch331! Tubezlob (🙋) 13:01, 25 March 2018 (UTC)

Scientific Journals with more than 3 stated ISSNs

Examples:

Mahdimoqri (talk) 17:22, 25 March 2018 (UTC)

SELECT ?item ?itemLabel ?count WHERE {
  {
    SELECT ?item (COUNT(DISTINCT ?ISSN) AS ?count) WHERE {
      ?item wdt:P31 wd:Q5633421.
      ?item wdt:P236 ?ISSN
    }
    GROUP BY ?item
  }
  FILTER ( ?count > 3 )
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY DESC(?count) ?itemL
Try it!

Mahdimoqri (talk) 17:22, 25 March 2018 (UTC)

Bengali Wikipedia articles with no Wikidata statement

I would like to get a list of Bengali Wikipedia articles with no Wikidata statement. Thanks. Bodhisattwa (talk) 12:59, 27 March 2018 (UTC)

SELECT ?item ?itemLabel ?sl ?article
WHERE
{
  ?item wikibase:statements 0 .
  ?article schema:about ?item ; schema:isPartOf <https://bn.wikipedia.org/> .
  ?item wikibase:sitelinks ?sl .           
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,bn". }
}

Try it!

@Bodhisattwa:
--- Jura 13:23, 27 March 2018 (UTC)

This may also be of use - petscan listing of .bn articles with no wikidata item ... these will not feature in Jura's report. --Tagishsimon (talk) 13:30, 27 March 2018 (UTC)
Thanks a lot @Jura1, Tagishsimon: -- Bodhisattwa (talk) 16:13, 27 March 2018 (UTC)
@Bodhisattwa: For articles without items, you might want to look at https://tools.wmflabs.org/wikidata-todo/duplicity.php?wiki=bnwiki&mode=stats as well.
--- Jura 18:38, 27 March 2018 (UTC)
@Jura1:, sure, thanks.. Bodhisattwa (talk) 07:04, 28 March 2018 (UTC)

Pedido para agregar información en la mini base de datos de Q6152160 (Jane Duncan)

@AnselmiJuan: Question removed to Project Chat --Tagishsimon (talk) 15:12, 28 March 2018 (UTC)

BTW, there are project chat pages for languages other than English.
--- Jura 06:47, 29 March 2018 (UTC)

Number of items with at least one Wikipedia page

I want to learn to what percentage the current Wikidata items are connected to Wikipedia pages. Probably this already has been evaluated serveral times, but I was not able to find any evidence. What I tried is

# Count Wikidata items with Wikipedia pages
#
PREFIX schema: <http://schema.org/>
#
select (count(distinct ?item) as ?itemsWithWpPages)
where {
  ?page schema:about ?item .
  filter(contains(str(?page), 'wikipedia'))
}
Try it!

Unfortunately, that times out. Adding a hint:Query hint:optimizer "None" . did not help. Neither did loading the "truthy" dump into a custom endpoint - since this dump lacks the statements about pages. (Jneubert - forgot to sign)

You may find User:Pasleim/Sitelink statistics of interest. It suggests 20,929,414 of 45,878,127 items have no sitelink ... I _guess_ both figures include redirects & if so, would be interesting to see stats removing them. --Tagishsimon (talk) 15:46, 28 March 2018 (UTC)
Thank you so much for the hint (and thanks @pasleim: for creating this in the first place), this is very helpful. The number of items without Wikipedia sitelink will be somehow larger. Disambiguation pages shouldn't count, too. So for now I'd very roughly estimate that already half of the Wikidata items or more are not linked to a Wikipedia content page. Exact numbers and time series would be great, but that would probably require much deeper diving. Jneubert (talk) 17:47, 28 March 2018 (UTC)
The number of 7,604,149 sitelinks to enwiki vs. 5,599,691 articles announced today on the enwiki home page gives an idea about the overhead. Jneubert (talk) 17:51, 28 March 2018 (UTC)
Our items link to Categories, Templates, etc; not just to articles. Still, the 7.6M vs 5.6M figures are interesting. --Tagishsimon (talk) 00:13, 29 March 2018 (UTC)

Optimization needed for artist dupe detection

Do you have any advice to speed up the following query?
(It's from here and I'd like to use a version of it that checks subclasses of visual artist (Q3391743) instead of just painter (Q1028181) (and improve matching in other ways), now it's between 20 and 65 seconds, but then it times out.)

SELECT DISTINCT ?itemName ?item ?dob1 ?dod1 ?item2 ?dob2 ?dod2 ?dobdiff ?doddiff ?meandiff WHERE {
  ?item (wdt:P106/wdt:P279*) wd:Q1028181. # painter or any subclass thereof
  { ?item rdfs:label ?itemName. } UNION { ?item skos:altLabel ?itemName. }
  FILTER((LANG(?itemName)) = "en")
  ?item2 (wdt:P106/wdt:P279*) wd:Q1028181.
  { ?item2 rdfs:label ?itemName. } UNION { ?item2 skos:altLabel ?itemName. }
  FILTER((STR(?item)) < (STR(?item2)))
  MINUS { ?item wdt:P1889 ?item2. } # different from
  MINUS { ?item2 wdt:P1889 ?item. }
  MINUS { ?item wdt:P460 ?item2. } # said to be the same as
  MINUS { ?item2 wdt:P460 ?item. }
  MINUS { ?item wdt:P3373 ?item2. } # sibling
  MINUS { ?item2 wdt:P3373 ?item. }
  
  OPTIONAL { ?item wdt:P569 ?dob1. }
  OPTIONAL { ?item2 wdt:P569 ?dob2. }
  BIND((CEIL(ABS((?dob1 - ?dob2) / 3.652425))) / 100 AS ?dobdiff)
  
  OPTIONAL { ?item wdt:P570 ?dod1. }
  OPTIONAL { ?item2 wdt:P570 ?dod2. }
  BIND((CEIL(ABS((?dod1 - ?dod2) / 3.652425))) / 100 AS ?doddiff)
  
  BIND(BOUND(?dobdiff) AS ?dobdiffbound)
  BIND(BOUND(?doddiff) AS ?doddiffbound)

  BIND(IF(BOUND(?dobdiff), ?dobdiff, 0) AS ?dobdiffsafe)
  BIND(IF(BOUND(?doddiff), ?doddiff, 0) AS ?doddiffsafe)
  BIND((CEIL((?dobdiffsafe + ?doddiffsafe) / (xsd:integer(?dobdiffbound) + xsd:integer(?doddiffbound)) * 100) / 100) AS ?meandiff)
}
ORDER BY (!(?dobdiffbound) && !(?doddiffbound)) ?meandiff ?dobdiffbound ?dobdiff ?doddiffbound ?doddiff ?item
LIMIT 500
Try it!

Thanks a lot in advance, --Marsupium (talk) 14:32, 29 March 2018 (UTC)

@Marsupium: This may or may not be the route to go...
SELECT DISTINCT ?itemName ?item ?dob1 ?dod1 ?item2 ?dob2 ?dod2 ?dobdiff ?doddiff ?meandiff with 
{
select distinct ?item ?item2 ?itemName WHERE {
  ?item wdt:P106 ?occ .
  ?occ wdt:P279* wd:Q3391743 .  
  { ?item rdfs:label ?itemName. } UNION { ?item skos:altLabel ?itemName. }
  ?item2 wdt:P106 ?occ2 .
  ?occ2 wdt:P279* wd:Q3391743 .
  { ?item2 rdfs:label ?itemName. } UNION { ?item2 skos:altLabel ?itemName. }
  FILTER((LANG(?itemName)) = "en")
}
LIMIT 10000
} as %items
where {
include %items
  FILTER((STR(?item)) < (STR(?item2)))
  MINUS { ?item wdt:P1889 ?item2. } # different from
  MINUS { ?item2 wdt:P1889 ?item. }
  MINUS { ?item wdt:P460 ?item2. } # said to be the same as
  MINUS { ?item2 wdt:P460 ?item. }
  MINUS { ?item wdt:P3373 ?item2. } # sibling
  MINUS { ?item2 wdt:P3373 ?item. }
  
  OPTIONAL { ?item wdt:P569 ?dob1. }
  OPTIONAL { ?item2 wdt:P569 ?dob2. }
  BIND((CEIL(ABS((?dob1 - ?dob2) / 3.652425))) / 100 AS ?dobdiff)
  
  OPTIONAL { ?item wdt:P570 ?dod1. }
  OPTIONAL { ?item2 wdt:P570 ?dod2. }
  BIND((CEIL(ABS((?dod1 - ?dod2) / 3.652425))) / 100 AS ?doddiff)
  
  BIND(BOUND(?dobdiff) AS ?dobdiffbound)
  BIND(BOUND(?doddiff) AS ?doddiffbound)

  BIND(IF(BOUND(?dobdiff), ?dobdiff, 0) AS ?dobdiffsafe)
  BIND(IF(BOUND(?doddiff), ?doddiff, 0) AS ?doddiffsafe)
  BIND((CEIL((?dobdiffsafe + ?doddiffsafe) / (xsd:integer(?dobdiffbound) + xsd:integer(?doddiffbound)) * 100) / 100) AS ?meandiff)
# SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY (!(?dobdiffbound) && !(?doddiffbound)) ?meandiff ?dobdiffbound ?dobdiff ?doddiffbound ?doddiff ?item
Try it!
--Tagishsimon (talk) 21:32, 29 March 2018 (UTC)
@Tagishsimon: Thanks a lot! This narrows down results, but better than nothing and I try to minimize the limit! --Marsupium (talk) 14:03, 30 March 2018 (UTC)

Country name in a given language & in Bengali

I need a query which will give me all country name in a given language & in Bengali. For example i want all country in Italian & in Bengali, so result should be like this:

it bn
Italia ইতালি
Bangladesh বাংলাদেশ

I don't need Q number. First column should be in given language. I need this to create wiktionary entry for Bengali wiktionary. --Aftabuzzaman (talk) 21:31, 30 March 2018 (UTC)

Have a look at Wikidata:Lists/nomes de países em português.
--- Jura 21:38, 30 March 2018 (UTC)
@Aftabuzzaman: Here's a report for IT & BN, limited to countries that are members of the United Nations ... the definition of 'country' needs to be pinned down for the purposes of the query ... other options exist - see, for instance, the P31 claims for Russia (Q159). The report can be amended given your feedback. Change the language parameters in the filter statements for different languages. Finally, note that this report requires the item to have a label in both languages; again, it can be changed should you wish for countries which have labels in only one of the languages.
select distinct ?countryLabel1 ?countryLabel2  WHERE {
  ?country wdt:P463 wd:Q1065 .    # item is a member of the United Nations
  ?country rdfs:label ?countryLabel1. 
  FILTER((LANG(?countryLabel1)) = "it")
  ?country rdfs:label ?countryLabel2. 
  FILTER((LANG(?countryLabel2)) = "bn")
}
order by ?countryLabel2
Try it!
hth --Tagishsimon (talk) 23:16, 30 March 2018 (UTC)
@Tagishsimon: Thank you. Aftabuzzaman (talk) 00:14, 31 March 2018 (UTC)