Wikidata:Property proposal/waterbody ID (Germany)

LAWA waterbody ID


Originally proposed at Wikidata:Property proposal/Place

Description: identifier for bodies of water in Germany according to the Bund/Länder-Arbeitsgemeinschaft Wasser (LAWA)
Represents: Gewässerkennzahl (Q1428658)
Data type: External identifier
Template parameter: GKZ in de:Vorlage:Infobox Fluss
Domain: bodies of water
Allowed values: [1-69][0-9]{0,9}
Example: Rhine (Q584) → 2, Heusiepen (Q1616544) → 27366462
Source: various sources, e.g. http://www.lfu.bayern.de/wasser/gewaesserverzeichnisse/doc/tab_alle.xls, http://www.lanuv.nrw.de/fileadmin/lanuv/wasser/pdf/Gewaesserverzeichnis%20GSK3C.xls
See also: Sandre river ID (P1717) (for rivers which flow through both France and Germany), Wikidata:Property proposal/river code in Romania
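
The allowed-values pattern can be checked mechanically. The following is a minimal Python sketch and not part of the proposal itself: the function name `is_valid_lawa_id` is hypothetical, and the only inputs taken from the proposal are the regular expression and the two example values. Note that `[1-69]` is a character class (a digit from 1 to 6, or 9), not the numeric range 1 to 69.

```python
import re

# Pattern copied verbatim from the proposal's allowed-values field.
# [1-69] matches one character: a digit 1-6 or the digit 9.
LAWA_ID = re.compile(r"[1-69][0-9]{0,9}")


def is_valid_lawa_id(value: str) -> bool:
    """Return True if the whole string matches the proposed format."""
    return LAWA_ID.fullmatch(value) is not None


# The two examples from the proposal, plus a value carrying the "DE/" prefix
# and a value starting with a digit outside the character class.
for candidate in ["2", "27366462", "DE/2", "7"]:
    print(candidate, is_valid_lawa_id(candidate))
```

Using `fullmatch` rather than `match` matters here: it rejects values with trailing characters, so a string with more than ten digits does not pass.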
Motivation

We currently have Gewässerkennzahl (P1183) based on the GKZ parameter of de:Vorlage:Infobox Fluss. As I originally wrote in Property talk:P1183#Is_this_really_a_single_thing.3F, I can find no evidence of this being a single identifier. The format used appears to be an invention by the German Wikipedia to allow multiple unrelated identifiers from different countries to share the same infobox parameter. We already have country-specific properties for some of the IDs. Other countries appear to have more than one set of IDs. I think the right thing to do is to identify the individual systems and split P1183 into separate properties for each one. If people agree that this is the right approach, I'll create more proposals for the other systems I've been able to identify.

This proposal is for the system in Germany described at de:Gewässerkennzahl (Deutschland). I haven't been able to find any central source for these, only various PDFs or spreadsheets from individual states, but the Wikipedia page says it's a country-wide system.

The corresponding tag in OSM is ref:fgkz (while we can't import directly from OSM, maybe we can still use it to find missing OpenStreetMap relation ID (P402) statements).

- Nikki (talk) 12:50, 26 October 2016 (UTC)

Discussion
  •   Support ChristianKl (talk) 17:16, 6 November 2016 (UTC)
  •   Weak oppose I don't see the point - you haven't identified a formatter URL that can be used to link these, so this would be an external ID without a link. And Gewässerkennzahl (P1183) uses a DE/ prefix for at least the German numbers, it seems, so there shouldn't be ambiguities here. Have you talked to the German Wikipedia people about this? ArthurPSmith (talk) 21:53, 7 November 2016 (UTC)
    Responding to the different points:
    • Formatter URLs are useful but they're not required, and it's not uncommon to lack one (roughly 1 in 6 external identifier properties has none).
    • The point is that we're storing multiple unrelated identifiers as a single property using an invented format to (try to) make them unambiguous and I can't see how that can possibly be a desirable situation. "DE/" is not part of the identifier and if we corrected the identifiers, they would be ambiguous. If P1183 were being proposed today, we would almost certainly reject it because it's not a single identifier, it's not even a coherent set of identifiers, it's just a dumping ground for any identifiers which haven't got their own property and that's not a good approach to structured data.
    • I haven't talked to the German Wikipedia people about it... I'm not sure what you think I should say to them. They don't use our data for this, so it doesn't affect them and even if they did want to use it, they would already need to combine multiple properties together because we don't use P1183 for identifiers which have their own property. We also don't use the exact format the German Wikipedia uses, e.g. the GKZ parameter on de:Rhein is "CH/1/DE/2/FR/A---0000" (which means the Swiss code is "1", the German code is "2" and the French code (Sandre river ID (P1717)) is "A---0000"), but we don't enter "CH/1/DE/2/FR/A---0000" as the value for P1183.
    - Nikki (talk) 16:53, 8 November 2016 (UTC)
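
To illustrate the combined format described in the comment above, here is a minimal Python sketch that splits a German Wikipedia GKZ value into its per-country codes. It assumes the value is a flat sequence of alternating country prefix and code separated by "/", and that no code itself contains a slash. Both assumptions are drawn only from the single de:Rhein example, and `split_gkz` is a hypothetical helper, not an existing tool.

```python
def split_gkz(value: str) -> dict[str, str]:
    """Split a combined GKZ string into {country prefix: code}.

    Assumes alternating prefix/code segments separated by "/".
    """
    parts = value.split("/")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected prefix/code pairs, got {value!r}")
    # Even positions are country prefixes, odd positions are the codes.
    return dict(zip(parts[0::2], parts[1::2]))


print(split_gkz("CH/1/DE/2/FR/A---0000"))
# {'CH': '1', 'DE': '2', 'FR': 'A---0000'}
```

Only the "DE" entry of such a result would be a candidate value for the property proposed here; the "FR" entry corresponds to Sandre river ID (P1717).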
  • Ok, I've changed my oppose to a "weak" one. I don't see that this change really brings any benefit (for example, just having 3 separate entries for this property for de:Rhein would seem fine to me and essentially match what the German Wikipedia people are doing), but if this sort of thing has been done for other IDs of this sort, I guess I don't strongly object to making this change. ArthurPSmith (talk) 18:46, 14 November 2016 (UTC)
    • I think P1183 looks like a single identifier, but I don't think it is. In the meantime, I think the US identifiers have already been assigned to GNIS. FR and RU already have their own properties --- Jura 19:31, 8 November 2016 (UTC)
  • It's been a couple of months now with no further comments and I would appreciate some sort of resolution to this so that I know whether to continue work on these IDs or abandon it. I do still believe that the existing property is terrible and unfit for purpose because it lumps together a bunch of unidentified unrelated identifiers using an invented format not used anywhere else other than the German Wikipedia (and even then only in the page source), but it would be a waste of my time to do any more work on trying to identify and separate out the rest of them if we're not actually going to split it. - Nikki (talk) 14:52, 19 January 2017 (UTC)
  • Given the description on the German Wikipedia page this seems to be a well-defined property with the Bund/Länder-Arbeitsgemeinschaft Wasser as the authority for the definition of the number. ChristianKl (talk) 13:35, 15 April 2017 (UTC)
  •   Support --1-Byte (talk) 15:23, 15 April 2017 (UTC)
  • @1-Byte, ArthurPSmith, Nikki:   Done ChristianKl (talk) 13:12, 18 April 2017 (UTC)