Wikidata:Property proposal/name-suggestion-index identifier
name-suggestion-index identifier
Originally proposed at Wikidata:Property proposal/Authority control
Description | identifier for a brand in OpenStreetMap's name-suggestion-index |
---|---|
Represents | Name Suggestion Index (Q62108705) |
Data type | External identifier |
Domain | retail chain (Q507619) |
Allowed values | [a-z_]{1,16}\/[a-z_]{1,48}\ |
Example 1 | McDonald’s (Q38076) → amenity/fast_food|McDonald's |
Example 2 | Shell (Q154950) → amenity/fuel|Shell , shop/convenience|Shell , amenity/fuel|เชลล์ |
Example 3 | United States Postal Service (Q668687) → amenity/post_office|United States Post Office |
Source | [1] |
External links | Use in sister projects: [ar] • [de] • [en] • [es] • [fr] • [he] • [it] • [ja] • [ko] • [nl] • [pl] • [pt] • [ru] • [sv] • [vi] • [zh] • [commons] • [species] • [wd] • [en.wikt] • [fr.wikt]. |
Planned use | Upon approval, this property will be mentioned in name-suggestion-index's contributing guide and the brand:wikidata key's official documentation, and NSI contributors will immediately begin adding the property to some existing items that have been deleted by mistake in the past. |
Number of IDs in source | 5,747 identifiers, 5,213 that have corresponding Wikidata items as of 86be8c6fd568c9757ec1816362a58ffe13df0adf |
Expected completeness | always incomplete (Q21873886) |
Formatter URL | https://nsi.guide/?id=$1 |
See also |
|
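As a rough illustration of how the Formatter URL row could be applied to the example identifiers, the sketch below substitutes an identifier for the $1 placeholder. Percent-encoding the value is an assumption of this sketch (the identifiers contain "/", "|", apostrophes, and non-Latin text), and expand_formatter_url is a hypothetical helper name, not part of Wikidata or NSI.

```python
from urllib.parse import quote, unquote

# Formatter URL as given in the proposal table.
FORMATTER_URL = "https://nsi.guide/?id=$1"


def expand_formatter_url(identifier: str, formatter: str = FORMATTER_URL) -> str:
    """Substitute a percent-encoded identifier for the $1 placeholder.

    Encoding every reserved character (safe="") is an assumption of this
    sketch; the proposal does not say how the value should be escaped.
    """
    return formatter.replace("$1", quote(identifier, safe=""))


# Example identifiers taken from the proposal table.
for nsi_id in ("amenity/fast_food|McDonald's", "amenity/fuel|เชลล์"):
    url = expand_formatter_url(nsi_id)
    # Decoding the query value recovers the original identifier.
    assert unquote(url.split("?id=", 1)[1]) == nsi_id
    print(url)
```

Whether nsi.guide or Wikidata's own link rendering expects the value encoded this way is not stated in the proposal, so treat the output as one plausible form of the link.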
Motivation
name-suggestion-index is the OpenStreetMap project's de facto authority for brand-related tagging information. Entries in NSI are presented to mappers as presets to choose from when mapping, alongside unbranded presets like "road", "lake", "restaurant", or "ATM". Most entries were created because NSI scripts flagged certain names as being common supermarket names (for instance) in the main OSM database.
Most entries are for brands that already have Wikidata items, so linking OpenStreetMap with Wikidata is just a matter of adding a brand:wikidata tag to the entry. It isn't feasible to link Wikidata to every instance of a chain store location in OSM, but it is possible to link to the brand's entry in NSI. It would only be feasible for this user script to query OSM for chain store locations to plot on a map when it knows what kind of business OSM considers the chain to be.
The idea of creating an identifier property for NSI has come up a couple times in the context of undeletion discussions (another instance). I don't think the presence of this property on an item would establish notability by itself, given that OSM isn't considered an authority on store locations. But it could give administrators a little more clarity when assessing whether a business-related item is merely undeveloped or whether it's spam.
– Minh Nguyễn 💬 02:11, 9 November 2019 (UTC)
Discussion
- Comment You added links to your examples, and that is obviously helpful, but we should keep in mind that in the current form Wikidata will not be able to generate these links (since it only supports inserting the full statement value in a formatter URL). There are a few options:
- Use a URL datatype and store the whole URLs as values (they can still be restricted to a particular format via a constraint)
- Set up a proxy which accepts values in the format you suggest and translates them to your service (ArthurPSmith can help) - this should only be done if the format you are proposing is already attested somewhere else
- Find another format for which there already exists a service which accepts these values as part of its URLs
- Let me know if any of this is unclear. − Pintoch (talk) 13:45, 9 November 2019 (UTC)
- Thanks for the suggestions Pintoch! NSI has been using this identifier format on its pages but not in its URLs. I've proposed a change to NSI that would allow us to use a format of https://nsi.guide/?id=$1. – Minh Nguyễn 💬 17:27, 9 November 2019 (UTC)
- I think it might make sense to leave the string itself unlinked, but put the URI (either of the k&v type or the id type) as a reference. After all, NSI is an authority of what identifiers it uses :) Arlo Barnes (talk) 22:14, 10 November 2019 (UTC)
- Support Seems valuable, to be able to access this classification from Wikidata. I would prefer the External ID datatype, especially if NSI URLs can be adapted to accept such values -- linked data is so helpful. And no good reason not to do this upfront, rather than through some semi-hidden reference mechanism. Would the applies to name of subject (P5168) qualifier generally be used to indicate the particular name being identified, or would this be redundant given the structure of the identifier? Jheald (talk) 17:57, 11 November 2019 (UTC)
- @Jheald: Yes, if I understand the qualifier correctly, it would be helpful in the case of international brands like Shell (Q154950) above. – Minh Nguyễn 💬 02:36, 23 November 2019 (UTC)
- Comment @Mxn: Routinely having multiple different strings to "identify" the same entity is not really good practice for an "external identifier", though having the same string "identify" multiple entities is more fatal. I think as Pintoch suggested above you would probably be better off making this a URL datatype, or a string datatype and not worry about linking in the Wikidata UI... ArthurPSmith (talk) 00:16, 24 November 2019 (UTC)
- @ArthurPSmith: I'm afraid I don't follow... even with a URL datatype, a given entity may have multiple NSI URLs, because NSI identifiers partly consist of the local name, which can vary from country to country or from language to language (or writing system to writing system). Would that be a problem? – Minh Nguyễn 💬 01:10, 28 November 2019 (UTC)
- @Mxn: Just that there is usually no expectation for a URL datatype to be single-valued, while an external id generally is (it is the "identifier" for the entity in the database). ArthurPSmith (talk) 12:59, 30 November 2019 (UTC)
- Comment And for your future RegEx, I propose:
https:\/\/nsi\.guide\/index\.html\?k=[a-z]{1,10}&v=\w{1,48}#.{1,128}
The one in the proposal doesn't work. Cordially. —Eihel (talk) 02:05, 27 November 2019 (UTC)
- Note that the regex in the proposal assumes that [2] will be merged before this proposed property comes into use. – Minh Nguyễn 💬 01:10, 28 November 2019 (UTC)
- @Mxn: An unescaped delimiter must be escaped with a backslash. So, what do you think about
[a-z]{1,10}\/\w{1,48}\|.{1,128}
? —Eihel (talk) 23:09, 13 December 2019 (UTC)
- Strong oppose The RegEx doesn't work (see Pull request too, in Category.js) and no formatter URL (P1630). —Eihel (talk) 14:51, 18 December 2019 (UTC)
- Conditional support with formatter URL (P1630), it's better, or URL type as suggested above. —Eihel (talk) 04:14, 15 January 2020 (UTC)
- Support I think it's a good idea. Although, as a collaborator on the NSI project who has had some of their Wikidata entries deleted, I might be a little biased. But I think it should happen anyway. --Adamant1 (talk) 04:47, 8 December 2019 (UTC)
- Support --Tinker Bell ★ ♥ 21:08, 13 December 2019 (UTC)
- Support as a contributor to Wikidata and name-suggestion-index. —Vahurzpu (talk) 05:08, 18 December 2019 (UTC)
- Support marked as ready with Eihel's regex --99of9 (talk) 00:11, 15 January 2020 (UTC)
- Hello @99of9:, RegEx is an internal problem in Wikidata: this does not prevent the introduction of identifiers. Thank you for your change. But the proposal cannot yet lead to a property: the suggestions of ArthurPSmith and Pintoch are not followed and Minh Nguyễn offers us a formatter URL (P1630) instead. Knowing that https://nsi.guide/?id=amenity/fast_food|McDonald's doesn't work, we cannot continue, for the moment. It is up to Mxn to make the necessary changes: its initiative or the various proposals above. So, in my opinion, would you be so kind as to re-position the proposal as not available for creation, please? Cordially. —Eihel (talk) 02:30, 15 January 2020 (UTC)
- @Eihel: It is not a requirement for all external IDs to have a formatter URL (P1630). I respect your opinion - I like external links too. But I see a consensus here to accept this property even without it. Just like Regex, formatter URLs can be added or edited later. Pintoch's second solution can still be implemented, even if they don't add a direct URL (which it looks like they will). --99of9 (talk) 03:00, 15 January 2020 (UTC)
- @99of9: Mezzo. If the property is ultimately not of type External identifier, but String or URL as suggested, the property should be deleted or modified by the development team according to the directives (discussion beforehand, community acceptance and request). As you can see, this brings complications that we would do without. But if there is consensus… nb: an external identifier which cannot be reached by a link is an aberration, IMHO. —Eihel (talk) 03:22, 15 January 2020 (UTC)
- Comment I'm not really convinced by the approach above, but, beyond splitting it into two properties (amenity and shop), I'm not sure what to suggest .. --- Jura 10:08, 20 January 2020 (UTC)
- comment 2 Maybe use some qualifier (e.g. OpenStreetMap tag or key (P1282)) instead of "amenity/fast_food". --- Jura 10:08, 20 January 2020 (UTC)
- Comment Given the further discussion, I've removed the ready tag. I think things will be clearer once they make a URL format which can accept these IDs. --99of9 (talk) 00:58, 21 January 2020 (UTC)
- Comment The id= query parameter has been implemented, so I've updated the proposal to include a working formatter URL. I've also updated the regular expression to allow for slightly longer keys and much longer names, since the project plans to include entries with the public_transport=* key at some point, and theoretically a key-value pair can be up to 255 characters long. – Minh Nguyễn 💬 04:37, 27 April 2020 (UTC)
- Support now that we have the regex. Arlo Barnes (talk) 17:23, 27 April 2020 (UTC)
- @Mxn, Pintoch, Arlo Barnes, Jheald, ArthurPSmith, Eihel: and @Adamant1, Vahurzpu, 99of9, Jura1: --Tinker Bell ★ ♥ 01:37, 23 May 2020 (UTC)
- Support I assume that's what I'm supposed to do now that the URL thing has been implemented. If that's not what I was pinged for though, just let me know and I'll retract it/do whatever else I'm supposed to. --Adamant1 (talk) 05:40, 23 May 2020 (UTC)
- Comment Now that it's implemented I noticed it has an instance of retail chain requirement, but there are entries in the NSI that are not retail chains, such as banks, which are instances of financial institution. There are also commercial companies like insurance agencies in the index. So can the requirement be expanded to incorporate non-retail-chain business instances somehow? --Adamant1 (talk) 08:38, 27 May 2020 (UTC)
- Support, hope that NSI can be beneficial to Wikidata also. *angys* (talk) 14:44, 28 May 2020 (UTC)
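
To make the regular-expression discussion above concrete, here is a minimal sketch that checks the proposal's example identifiers against the pattern Eihel suggested on 13 December 2019. It uses Python's re module, whose handling of \w and . may differ from the engine behind Wikidata's format constraints, so the results are illustrative only. The public_transport identifier is a made-up value (not taken from NSI), included to show why the key part was later widened; is_valid is a hypothetical helper name.

```python
import re

# Pattern suggested in the discussion: key, "/", value, "|", name.
EIHEL_PATTERN = re.compile(r"[a-z]{1,10}\/\w{1,48}\|.{1,128}")

# Example identifiers taken from the proposal table.
EXAMPLES = [
    "amenity/fast_food|McDonald's",
    "amenity/fuel|Shell",
    "shop/convenience|Shell",
    "amenity/fuel|เชลล์",
    "amenity/post_office|United States Post Office",
]

# Hypothetical identifier, invented for this sketch: its key contains an
# underscore and is 16 characters long, which [a-z]{1,10} cannot match.
HYPOTHETICAL = "public_transport/station|Example Transit"


def is_valid(identifier: str) -> bool:
    """Return True if the whole identifier matches the suggested pattern."""
    return EIHEL_PATTERN.fullmatch(identifier) is not None


for candidate in EXAMPLES + [HYPOTHETICAL]:
    print(candidate, is_valid(candidate))
```

Under these assumptions the five examples from the proposal match, while the public_transport value does not, which is consistent with the later change to allow slightly longer keys.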