Property talk:P709

Latest comment: 1 year ago by Jheald in topic Errors and anomalies

Documentation

Historic Environment Scotland ID
identifier for a building in the Historic Environment Scotland database
Description: HB Number in Historic Scotland database
Associated item: Historic Environment Scotland (Q21997561)
Has quality: identifier (Q853614)
Data type: External identifier
Template parameter: Template:Hbnumber (Q14089336)
Domain
According to this template: geographic location (Q2221906)
According to statements in the property:
geographic location (Q2221906), geographical feature (Q618123) or sculpture (Q860861)
When possible, data should only be stored as statements
Allowed values: [A-Z]?[A-Z][A-Z][0-9][0-9]*
Example: Tantallon Castle (Q57803) → SM13326
Torosay Castle (Q129472) → LB17975
Kailzie Gardens (Q15232541) → GDL00229
Source: http://hsewsf.sedsh.gov.uk/hslive/hbsearch.show
Formatter URL: https://portal.historicenvironment.scot/designation/$1
Robot and gadget jobs: DeltaBot does the following jobs: http://www.britishlistedbuildings.co.uk/sc-$1
Related to country: United Kingdom (Q145) (See 324 others) (Scotland (Q22))
See also: Highland Historic Environment Record ID (P7304), Dictionary of Scottish Architects building ID (P7630), POWiS ID (P7659), Aberdeenshire HER ID (P7694), NatureScot Sitelink ID (P10015), Standing Waters Database ID (P10051)
Lists
Proposal discussion: Proposal discussion
Current uses
Total: 77,535
Main statement: 77,415 (99.8% of uses)
Qualifier: 4 (<0.1% of uses)
Reference: 116 (0.1% of uses)
Search for values
Format “[A-Z]?[A-Z][A-Z][0-9][0-9]*”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P709#Format, SPARQL
Item “country (P17): United Kingdom (Q145)”: Items with this property should also have “country (P17): United Kingdom (Q145)”. (Help)
List of violations of this constraint: Database reports/Constraint violations/P709#Item P17, hourly updated report, search, SPARQL
Single value: this property generally contains a single value. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P709#Single value, SPARQL
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P709#Unique value, SPARQL (every item), SPARQL (by value)
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P709#Entity types
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used by specified way only (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P709#Scope, SPARQL
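For anyone wanting to check values offline, here is a minimal Python sketch of the format constraint and the formatter URL listed above. It is not an official tool. It assumes the pattern is meant to match the whole value (hence `fullmatch`), and the function names `is_valid_hes_id` and `hes_url` are made up for illustration.

```python
import re

# Pattern copied from the format constraint in the documentation above.
HES_ID_PATTERN = re.compile(r"[A-Z]?[A-Z][A-Z][0-9][0-9]*")

# Formatter URL copied from the documentation above; "$1" is the placeholder.
FORMATTER_URL = "https://portal.historicenvironment.scot/designation/$1"


def is_valid_hes_id(value: str) -> bool:
    """Return True if the whole value matches the format constraint.

    Assumption: the constraint applies to the full string, not a substring.
    Note the pattern allows two or three leading letters only, so a
    four-letter prefix such as HMPA would be rejected as written.
    """
    return HES_ID_PATTERN.fullmatch(value) is not None


def hes_url(value: str) -> str:
    """Build a portal URL by substituting the ID into the formatter URL."""
    if not is_valid_hes_id(value):
        raise ValueError(f"not a valid Historic Environment Scotland ID: {value!r}")
    return FORMATTER_URL.replace("$1", value)
```

The three documented examples (SM13326, LB17975, GDL00229) pass the check, while a purely numeric value such as 13668 does not.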
 
This property is being used by:

Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)


Invalid IDs and URL formatter

It seems Historic Scotland has reorganized its database. Each ID now has a prefix (e.g. LB, SM), and the URL formatter has changed to http://portal.historic-scotland.gov.uk/designation/$1. Does anybody know how to derive the prefix, or whether there is a formatter URL that gives access to the data without knowing the prefix? --Pasleim (talk) 12:53, 18 April 2016 (UTC)

Does anybody know anything? --Pasleim (talk) 15:38, 9 February 2017 (UTC)
The co-ordinate conversion seems to be pretty borked as well -- or, at least, I've just fixed three out of three that were completely wrong. May be worth a re-upload of the whole data set, with the new prefixes, and re-done coordinates. The dataset appears to be here
  • I've started work on this. The format constraint now supports two characters at the start (LB, SM) and scheduled monuments are an accepted type (since they seem to be in the same database). The database also seems to include prefixes for battlefields (BTL), gardens (GDL), & historic marine protected areas (HMPA) but I'll leave those off for now - there's not many of them, thankfully.
The next step is going to be re-adding the prefixes. At the moment, out of ~69600 uses of P709, ~20300 have LB prefixes, ~7000 have SM prefixes, the rest are purely numeric. Anything larger than 14000 is a listed building (LB) - SM numbers don't go that high. This covers ~31000 of the numeric items and I'll run a batch job to convert those IDs at some point in the next day or two (unless anyone objects).
Unfortunately, lower than that, some numbers do clash - eg there is an SM13668 (airfield bomb stores at Wick) and an LB13668 (cottages near Stirling), and there are about 11000 items with a numeric ID in this range. We might be able to work out the correct prefix using the heritage designation property.
@Jheald, Thierry Caro, Stinglehammer, Pasleim: who might have thoughts on this (anyone else?). Andrew Gray (talk) 21:07, 25 September 2018 (UTC)
@Andrew Gray: Can we identify the right one by coordinates, if there are two possible choices? Or is Pasleim right that they're not good? Jheald (talk) 21:29, 25 September 2018 (UTC)
@Jheald: IME our coordinates are mostly good, but there are some weirdnesses. Maybe enough to make it tricky. One benefit is that when they've been imported wrongly they seem to be massively wrong - moving something from Dumfries to Uist - so if we can dig out a report which compares coordinates to P131 we might get somewhere. I think I remember seeing this for England a year or two ago? Andrew Gray (talk) 21:33, 25 September 2018 (UTC)
Okay, I've got the first ~31000 updates (all xxxxx -> LBxxxxx, numbers outside the SM range) ready. I'll get those running overnight. Andrew Gray (talk) 22:26, 25 September 2018 (UTC)
Update - my initial assumption was wrong, as some newer scheduled monuments are in the SM90xxx range. However, there's no LB90xxx buildings, so these were easy enough to fix. We now have:
  • 51401 LB items (34830 unique against a total of 47489)
  • 6988 SM items (6985 unique against a total of 8270)
  • 11197 number-only items (8482 unique)
So we've got about the right number of IDs, more or less, but we still have to assign a good chunk of them to LB/SM status, and a lot of them are fragmented into multiple items (this is usually the case where, eg, a single listing covers multiple items; I don't know if we should merge these or not but I'm not going to get into that now...)
For the remaining number-only items, I'm going to make an inference based on their listed status - if they're A/B/C then the code goes LB, if they're scheduled then SM. That takes care of just over 10000 entries leaving >1000 to resolve in a third phase. Andrew Gray (talk) 20:38, 27 September 2018 (UTC)
I've picked this up again (a nice Sunday-afternoon task). It looks like most have been resolved since last year, and there are currently about 200 items which have a HS ID but no recorded listed status, and just 27 which have a non-prefixed ID. Andrew Gray (talk) 15:08, 14 July 2019 (UTC)
All non-prefixed IDs now cleaned up (some removed as spurious). I'll see about comparing our data to the HS downloads and see what we have missing. Andrew Gray (talk) 15:48, 14 July 2019 (UTC)
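The prefix rules worked out in this thread can be sketched roughly as follows. This is only an illustration of the reasoning, not the batch job that was actually run. The function name `infer_prefix`, the designation labels ("A", "B", "C", "scheduled") and the exact bounds chosen for the "SM90xxx range" (90000 to 90999) are assumptions made for the example.

```python
from typing import Optional


def infer_prefix(number: int, designation: Optional[str] = None) -> Optional[str]:
    """Guess the prefix for a number-only ID, or return None if unresolved.

    Rules taken from the discussion:
      * the 90xxx range holds newer scheduled monuments and no listed
        buildings (bounds 90000-90999 are an assumption);
      * otherwise anything above 14000 is a listed building;
      * at or below 14000 the numbers can clash, so fall back on the
        heritage designation (labels here are hypothetical).
    """
    if 90000 <= number <= 90999:
        return "SM"
    if number > 14000:
        return "LB"
    if designation in {"A", "B", "C"}:
        return "LB"
    if designation == "scheduled":
        return "SM"
    return None  # needs manual resolution


def add_prefix(number: int, designation: Optional[str] = None) -> Optional[str]:
    """Return the prefixed ID, or None when the prefix cannot be inferred."""
    prefix = infer_prefix(number, designation)
    return f"{prefix}{number}" if prefix else None
```

A clashing number such as 13668 stays unresolved unless a designation is supplied, which matches the "third phase" of manual clean-up described above.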

Dead link

I've removed the formatter URL, as the site is dead and not archived at archive.org. Andy Mabbett (Pigsonthewing) 13:41, 18 August 2016 (UTC)

Multiple Ids

An object can have multiple IDs as a listed building or scheduled monument: see Q17775140. Or is there something separate for scheduled monuments? Agathoclea (talk) 18:28, 1 January 2019 (UTC)

This property covers scheduled monuments as well (SM prefix numbers). Note that in some cases a scheduled monument and a listed building at the same site may not quite be the same thing (eg Remains of Castle Donnan (Q31099913) & Eilean Donan Castle (Q20816698)). Andrew Gray (talk) 22:47, 1 January 2019 (UTC)

Notes on HS items, 2019

Hi all,

I've been looking into the current state of Scottish historic sites on Wikidata, and what follows is some notes (in no particular order) on the issues with them:

  • We are probably missing a few thousand listings - there are 55,806 listed buildings & scheduled monuments, but we have 50754 distinct IDs.
  • A large number of items are missing Historic Scotland IDs (Historic Environment Scotland ID (P709)) or heritage designation (P1435), occasionally both; many of these stem from Wikipedia imports. Some of these will be data gaps, but several hundred will be duplicate items needing merged.
  • Item titles are often confusingly worded, starting with a street address or village name, or in the case of some scheduled monuments, a nearby location.
  •   Resolved(?) Many items had missing or inconsistent P131. @Tagishsimon: has fixed most (all?) of these so that they all have a consistent P131 pointing to the relevant council area, harvested from Historic Scotland data.
    • Largely resolved. Valid P131s added for all items with an HS ID which previously lacked P131 and which were listed in the HS dump file. However I've not done a comparison of Wikidata P131 versus HS dump file Local Authority looking for errors. (Probably will sometime; not aware that there are problems with our items in this respect)
  • An indeterminate number of items were uploaded with erroneous coordinates associated with entirely different items (I have fixed some of these by hand, but plenty remain). This will probably require a complete reupload of coordinate data, which itself will need to take account of split sites, as...
  • ...lots of items have the same Historic Environment Scotland ID (P709) number (69k records vs 56k IDs). I think this is generally accepted when there are multiple parts to a listing (eg it's a row of houses, or it's a group of outbuildings listed together) but it would be good to have some way of showing "this is an intentional duplicate" vs "actually, maybe these two should be merged". Presumably we could do this with qualifiers on the ID to indicate part 1, part 2, etc but I do not know if there is a preferred or consistent approach here.
  • Over the past few years, many listed buildings have had their listing (LBxxxx) withdrawn and replaced by a scheduled monument listing (SMyyyy) - our data does not always reflect this consistently. We might have one ID/heritage status but not the other, or both without start/end dates, or both but on separate items. Similarly some may have been relisted from C to B, etc, which we may not have captured.
  • Many prominent "listed buildings" do not have a single listing status - eg Edinburgh Castle (Q212065) is actually twenty separate listings with different IDs, which are not clearly linked together. The entire site is, however, a single scheduled monument. In other cases, the scheduling may apply to "all of a site other than the listed building", eg Eilean Donan Castle (Q20816698) & Remains of Castle Donnan (Q31099913). We need to work out a clear way to express these relationships.
  • (added) Early imports seem to have sometimes duplicated coordinates/labels for single items, but given different IDs - eg this and this (the second is actually #7, not #1). This query finds items with exactly matching coordinates but different IDs. Unlike the erroneous coordinates above, these all seem to be items close to each other, so probably a different issue from a technical standpoint.
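The last check in the list (exactly matching coordinates but different IDs) can be sketched offline in Python, assuming the data has already been exported as simple tuples. This is not the query linked above, just an illustration of the grouping idea; the record layout and the function name `same_coords_different_ids` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Coord = Tuple[float, float]
Record = Tuple[str, str, Coord]  # (item QID, P709 value, (lat, lon)) - assumed layout


def same_coords_different_ids(records: Iterable[Record]) -> Dict[Coord, List[Tuple[str, str]]]:
    """Group items by exact coordinate and keep groups with more than one distinct ID."""
    by_coord: Dict[Coord, set] = defaultdict(set)
    for qid, hes_id, coord in records:
        by_coord[coord].add((qid, hes_id))

    suspects: Dict[Coord, List[Tuple[str, str]]] = {}
    for coord, members in by_coord.items():
        distinct_ids = {hes_id for _, hes_id in members}
        if len(distinct_ids) > 1:
            suspects[coord] = sorted(members)
    return suspects
```

Items that share both coordinate and ID (the "intentional duplicate" case for multi-part listings) are deliberately not flagged; only shared coordinates with differing IDs are returned.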

@Tagishsimon, Jheald, NavinoEvans: - is there anything I've missed in this list? Andrew Gray (talk) 18:47, 19 September 2019 (UTC)

@Andrew Gray: What's the current status on mixed-up coordinates? You were having a look at these. Do we think most coordinates are now reliable? Jheald (talk) 18:53, 19 September 2019 (UTC)
@Jheald: So I thought I'd fixed all of them by comparing to P131, but then Tagishsimon managed to find another few thousand P131s so we went back to square one :-). I am churning through those again. I think most (as in maybe 95-99%) are reliable, which is enough to say "okay, good enough for now", but it probably would not hurt to do a mass re-upload of coordinates with clear referencing if we're also fixing other things (eg the split items). After all, better to have the HS coordinates than random imported-from-WP ones. Andrew Gray (talk) 19:42, 19 September 2019 (UTC)
@Andrew Gray: Yes -- if the HS ID is attached to the correct item (or, more accurately a not-incorrect item, even if not the absolute best item). Is that something we can now broadly rely on? Jheald (talk) 20:10, 19 September 2019 (UTC)
@Jheald: I'm a little less confident than I was after digging into the depths of the weirder ones, but I think I'll stick with "most are right". I certainly don't think we're going to find that five or ten thousand are wrong, but a thousand or so having some kind of problem wouldn't surprise me. Andrew Gray (talk) 21:33, 19 September 2019 (UTC)

Parent-Child items

I'm not super-happy with the structure of what I'll call parent-child items. Four main issues, IMO: 1) absence of a parent item; 2) child items wrongly sporting an HS ID; 3) parent item having multiple coordinates; 4) absent child items.

absence of a parent item

If we take the example of the listing LB1837 which is for "2, 3, 4, 5, 6 CLEPHANTON VILLAGE", we have items for:

We lack an item for the complete heritage ensemble - i.e. for "2, 3, 4, 5, 6 CLEPHANTON VILLAGE". The complete heritage ensemble clearly is a thing deserving an item.

child items wrongly sporting an HS ID

Each of the child items in the above example claims to be LB1837 by having a Historic Environment Scotland ID (P709). None of them (on their own) are LB1837. LB1837 is the complete ensemble. P709 is sometimes used to hold the HS dump file sequence number using the qualifier series ordinal (P1545).

My preference is to have

parent item having multiple coordinates

Such as Dundee, Broughty Ferry, Beach Crescent, Lamp Standard (Q17813353) which has 1 coord for each of 9 lamp standards. We should be aiming for a single coord per item (and there's an argument to be made that if an ensemble item has children, coords should be on the children and not on the parent.)

absent child items

The other side of the Dundee, Broughty Ferry, Beach Crescent, Lamp Standard (Q17813353) example is that it lacks child items for each lamp standard, which would correspond with rows in the HS dump file. Where we have granular data - i.e. the HS dump file - we should have corresponding child items.

(In other news, I'm still petscanning for Scottish places with en wiki articles, which lack items or have items lacking P131; and adding coords to items. I have code to do the easting/northing to WGS84 exercise & will sometime soon try to marry-up our items with the HS dump, with a view to doing the coordinate comparison.) --Tagishsimon (talk) 02:45, 24 September 2019 (UTC)

Historic Env Scotland provide detail of the components making up a cultural heritage ensemble listing, so a key for the ID is a compound of P709 plus a series ordinal qualifier ... such qualifiers are currently being added, hopefully across the board. --Tagishsimon (talk) 08:05, 7 October 2019 (UTC)
However, creating items for the overall listed site seems reasonable (the whole terrace of houses, the whole country house plus outbuildings, etc) - especially since this is what we usually have described in other sources or by import from Wikipedia. This would have the main Historic Environment Scotland ID (P709) (perhaps a series ordinal zero, or "applies to part: whole site", or something?) and a single coordinate location (P625) - if HS give an overall coordinate great, but if not we could calculate the centre of all the child coordinates or something.
Interested to find out how scheduled monuments are handled, by the way. My impression is that HS take a much more "site-oriented" approach there and everything gets lumped into one schedule, even if the separate bits are distinct buildings some way apart. We might end up splitting some of them apart manually where WD/WP has two items but HS only have one schedule entry. Dun Telve and Dun Troddan brochs (Q56665581) is one I ran across when tidying last month. Andrew Gray (talk) 10:02, 4 October 2019 (UTC)
Without prejudice to this discussion, I happened on a parent-child set of items (and, btw, parent items are fairly rare; more on that eventually) which I've adorned with part of / has part relations. Might be useful to hone this set as an exemplar, supposing that our model can cope with an item for the cultural heritage ensemble (Q1516079) parent and for the children:
Some of the questions are whether the parent should have a heritage designation (P1435) - the item is the listing, not that which is listed; and ditto a coordinate, P131 &c. Some more sets listed at User:Tagishsimon/junk3 fwiw. --Tagishsimon (talk) 08:05, 7 October 2019 (UTC)
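To make the two ideas in this thread concrete, here is a rough Python sketch of a child part keyed by the compound of P709 plus series ordinal, and of a parent coordinate calculated as the centre of the child coordinates. It is one possible reading of the proposal, not an agreed model. The class and function names are hypothetical, and the plain arithmetic mean is an assumption that is only reasonable for parts lying close together.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ChildPart:
    """One component of a listing, e.g. one row in the HS dump file."""
    hes_id: str    # P709 value shared by all parts of the listing
    ordinal: int   # series ordinal (P1545) qualifier on that ID
    lat: float
    lon: float

    @property
    def key(self) -> Tuple[str, int]:
        """Compound key: P709 plus series ordinal."""
        return (self.hes_id, self.ordinal)


def parent_coordinate(children: Sequence[ChildPart]) -> Tuple[float, float]:
    """Single coordinate for the parent item: mean of the child coordinates.

    Assumption: a simple average of latitude and longitude is good enough
    for parts that are near each other (a terrace, a group of outbuildings).
    """
    if not children:
        raise ValueError("a parent needs at least one child to derive a coordinate")
    n = len(children)
    return (
        sum(c.lat for c in children) / n,
        sum(c.lon for c in children) / n,
    )
```

With this shape, two parts of LB1837 would have keys ("LB1837", 1) and ("LB1837", 2), while the parent item would carry the bare ID and the derived coordinate.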

Errors and anomalies
