Wikidata:Property proposal/named place on map

named place on map


Originally proposed at Wikidata:Property proposal/Place

Motivation

 
1576 Saxton map of Essex. It names a large number of places.

In March 2021 WikiProject EMEW's partners, the Viae Regiae (Q105547906) project, will start a major drive to transcribe and identify all the places and placenames on several series of 16th and 17th century maps, like the 1576 Saxton map of Essex to the right. (They will actually be using a higher-resolution copy, which we will be uploading).

As can be seen from the map, the number of places on it is very large; the process may generate on the order of 1000 located places per map, which we will be recording as Structured Data statements on Commons.

It would be good to have a property other than depicts (P180) to record this information. In the map to the right, the appropriate value of depicts (P180) is Essex (Q23240). The settlements of West Ham, Leyton, and Wanstead appear only as 3 small places out of a thousand, close to the border with Middlesex. For user convenience, efficiency of querying, and clarity, it is useful not to overload the main depicts (P180) statement with this information, still less to use main subject (P921), but to keep it segregated and record it with a different property, since the relationships are of a different nature. We shall also be exploring whether it is possible to emit the information again in a IIIF manifest, as a test of the ability of Structured Data on Commons to support annotation at scale.

The statements will be accompanied by the qualifiers object named as (P1932), to record the name as given on the map, and relative position within image (P2677), to record where in the image the place appears.

A question still to be evaluated is whether SDC and the SDC user interface will be able to cope well with such a large number of statements, all of the same kind. The technical view on the phabricator ticket raised by the WikiProject on this point (phab:T275286) seems to be "try it and see". But it may be that when there are a very large number of statements of a particular kind, it would be advantageous to put them in a collapse-box. This would be another reason to keep the principal depicts (P180) statements separate from the statements for named places, so they would remain visible.

However the main purpose of this property proposal is simply to have a property other than depicts (P180) to start trying to record this information.

Discussion

Wikimaps facebook group notified of property proposal: link
  •   Comment Interesting use of Wikidata. Personally, I'd try to do it on Wikidata (not SDC), but even here the number of statements might eventually become problematic through the GUI.
    What happens when the place name can be deciphered, but not matched (exactly or at all) to an item? Should the value be the name and the qualifier the mapping to a Wikidata item about the place? --- Jura 16:42, 2 March 2021 (UTC)
Another alternative could be to record it directly as a Lexeme form. This has the advantage that the place can be determined in a clearly separate step. --- Jura 16:52, 2 March 2021 (UTC)
@Jura1: Useful Qs. If the place cannot be identified, or has no wikidata item, we can use ?map "named place on map" <somevalue> / "stated as" ?name_on_map in the usual way.
Putting the identified place on the main statement, rather than on a qualifier, seemed the right way to go, to make it a couple of lines easier to write queries like "which of these maps include this place" (even though they may name it differently -- spelling was very variable in this period).
The reason to go for Commons and SDC, in the first instance, was to be able to hang the position qualifier off the statement, which may be specific to a particular digitisation of the map. But you'll notice that the property proposal specifies for use on files or items, and we're thinking about how high up the individual copy -> edition -> work hierarchy it might be useful also to record this information. Jheald (talk) 18:53, 2 March 2021 (UTC)
@Jura1: re-pinging, because I think I didn't get it right before (forgot the '1' on your username). Jheald (talk) 21:47, 2 March 2021 (UTC)
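To illustrate the query argument made above, here is a minimal Python sketch of why putting the identified place on the main statement keeps the "which of these maps include this place" lookup short. It is only an in-memory model, not SDC or the query service: the class, the function names, the file names "Map A.jpg" and "Map B.jpg", and the spellings "Westham" and "Example Hall" are all made up for illustration, and `None` stands in for <somevalue>.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NamedPlaceStatement:
    """Hypothetical model of one "named place on map" statement."""
    map_file: str             # the file the statement sits on
    place_qid: Optional[str]  # Wikidata item ID, or None to model <somevalue>
    name_on_map: str          # the transcribed name ("stated as" / P1932)


def maps_naming_place(statements: List[NamedPlaceStatement], place_qid: str) -> List[str]:
    """Return the files that name the given place, whatever spelling they use."""
    return sorted({s.map_file for s in statements if s.place_qid == place_qid})


def unidentified_names(statements: List[NamedPlaceStatement]) -> List[str]:
    """Return transcribed names that have not been matched to an item."""
    return [s.name_on_map for s in statements if s.place_qid is None]


# Illustrative data only: file names and most spellings are invented.
SAMPLE = [
    NamedPlaceStatement("Map A.jpg", "Q939617", "W Ham"),
    NamedPlaceStatement("Map B.jpg", "Q939617", "Westham"),
    NamedPlaceStatement("Map B.jpg", "Q377720", "Barking"),
    NamedPlaceStatement("Map A.jpg", None, "Example Hall"),
]

if __name__ == "__main__":
    print(maps_naming_place(SAMPLE, "Q939617"))
    print(unidentified_names(SAMPLE))
```

The lookup filters on the main value alone, so variant spellings do not matter; the transcribed string stays available for retrieval as a string.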
  • Instead of "try it and see" with SDC, you could just check any image with a large number of statements and see how it loads (or doesn't). I vaguely recall that already 20 was slow. In either place, you probably need to edit it with some other tool.
About the comparison qualifier vs. main statement: a point to consider is also that place names can refer to several places (maybe less a problem for maps than place names in works in general).
@Susannaanas: might be interested. --- Jura 13:52, 5 March 2021 (UTC)
@Jura1: On the ticket (phab:T275286) Lucas looks at c:File:Nature Timespiral.png, which currently has the largest number of SDC statements of any file on Commons, and finds it all seems to work. Jheald (talk) 16:22, 5 March 2021 (UTC)
Good for him. Might just be my browser. --- Jura 17:24, 5 March 2021 (UTC)
  •   Comment Sorry for taking the time to follow up. I think this is an interesting approach. I wonder if there are properties for annotations that could be put together with this, or whether it is a good idea to keep the place candidates separated from the start. I am not well enough up-to-date on annotation-related properties, but I am sure if you and the team have thought about this and ended up with this solution, it should be good enough. Could you sketch out a sample annotation with object named as (P1932) and relative position within image (P2677)? I also think that maps could be stored in Wikidata, and the annotations would refer to a map original rather than an image copy. But it seems you have both options. – Susanna Ånäs (Susannaanas) (talk) 14:20, 5 March 2021 (UTC)
  • I am confused however, that they should be recorded as items. In that case I would also suggest using lexemes instead. – Susanna Ånäs (Susannaanas) (talk) 14:27, 5 March 2021 (UTC)
@Jura1, Susannaanas: so a couple of examples for this map, with qualifiers:
File:Essexiae... "named place on map" West Ham (Q939617) / relative position within image (P2677) = "pct:15.2,85.0,0.5,1.0" / object named as (P1932) = "W Ham"
File:Essexiae... "named place on map" Barking (Q377720) / relative position within image (P2677) = "pct:20.5,84.7,1.4,1.3" / object named as (P1932) = "BARKINGE"
For me the lexeme idea doesn't work at all. The names on the map are strings, not forms of dictionary words. Creating Lexeme items for these particular spellings would have no particular value. It is useful to be able to retrieve the strings as strings in a query, and that can be done.
Secondly, use of relative position within image (P2677). For this image P2677 makes sense, as places are recorded using glyphs of different sizes. (See here for the project's current rough typology, which is likely to evolve.) Specifying the bounding box allows a crop just of the glyph to be retrieved easily. If the information were available, one might additionally specify region within image (P8276) to specify a polygon around the glyph. That information is not currently planned to be retrieved, but might be extracted at a future stage by machine learning methods.
For other maps it might make sense to record the position of just a point on the map rather than a box or a polygon. Unfortunately there appears to be no property to do that at present.
The design is essentially identical to how annotations are expected to be stored on Commons. (At least it seems so, depending on whether anybody has given thought to this.) Cf. e.g. the use of relative position within image (P2677) on File:ISSSpaceFoodOnATray.jpg
As to why target the files on Commons first, (i) if it breaks things, we don't want to break wikidata. Breaking presentation of Commons SDC is expendable, because nobody uses it for anything. Breaking wikidata is not; and (ii) as already stated, the glyphs are small objects. If the image (P18) value is changed to a new improved different digitisation, that is cropped even slightly differently, none of the boxes will fit round the glyphs any more. Therefore it makes more sense, at least to start with, to put the information on the SDC of the image. But we are thinking about what also to store on wikidata -- User:PKM in particular, who is developing the data model for the wikidata items for the maps (see Wikidata:WP EMEW/Sources).
Hope this helps to clarify our thinking a little bit. Jheald (talk) 16:16, 5 March 2021 (UTC)
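As a rough illustration of how a bounding box like the ones in the examples above could be turned into a crop, here is a minimal Python sketch. It assumes the P2677 values follow the IIIF Image API percentage region form `pct:x,y,w,h`, as the examples suggest; the function name, the `PixelBox` type, and the 10000 x 8000 image size are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Pixel rectangle: top-left corner plus width and height."""
    x: int
    y: int
    width: int
    height: int


def pct_region_to_pixels(region: str, image_width: int, image_height: int) -> PixelBox:
    """Convert a 'pct:x,y,w,h' region string into a pixel box.

    Assumes x and w are percentages of the image width, and y and h are
    percentages of the image height (the IIIF 'pct:' convention).
    """
    prefix = "pct:"
    if not region.startswith(prefix):
        raise ValueError(f"expected a 'pct:' region, got {region!r}")
    parts = region[len(prefix):].split(",")
    if len(parts) != 4:
        raise ValueError(f"expected four comma-separated numbers, got {region!r}")
    x_pct, y_pct, w_pct, h_pct = (float(p) for p in parts)
    return PixelBox(
        x=round(image_width * x_pct / 100),
        y=round(image_height * y_pct / 100),
        width=round(image_width * w_pct / 100),
        height=round(image_height * h_pct / 100),
    )


if __name__ == "__main__":
    # Hypothetical digitisation of 10000 x 8000 pixels.
    print(pct_region_to_pixels("pct:15.2,85.0,0.5,1.0", 10000, 8000))
```

Because the box is stored as percentages, it remains valid when the same digitisation is rescaled, but not when a differently cropped digitisation is substituted, which is the concern raised in point (ii).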
  Support I think I must have misunderstood a little, and this looks much more straightforward than what I first thought. So the values are geographic places, not place names or candidates for places or place names. I can see value in it being a separate property. – Susanna Ånäs (Susannaanas) (talk) 16:55, 5 March 2021 (UTC)