Wikidata:Property proposal/New York Times short URL
New York Times article ID
Originally proposed at Wikidata:Property proposal/Authority control
Not done
Description | short URL for a New York Times article |
---|---|
Data type | External identifier |
Allowed values | regex [a-zA-Z0-9]+ |
Example 1 | I Am Part of the Resistance Inside the Trump Administration (Q56488792) → 2CyF3Jh |
Example 2 | Aziz Ansari Is Guilty. Of Not Being a Mind Reader. (Q48342255) → 2EIbKzZ |
Example 3 | The Slut-Shaming of Nikki Haley (Q48344584) → 2FtInln |
Example 4 | Barack Obama and Me (Q58450190) → 2hJcqMP |
Source | HTML of nytimes.com |
Expected completeness | always incomplete (Q21873886) |
Formatter URL | https://nyti.ms/$1 |
See also | New York Times topic ID (P3221) |
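As a rough illustration of how the "Allowed values" pattern and the formatter URL in the table above would work together, here is a minimal Python sketch. The function name `format_short_url` is hypothetical, and the sketch assumes the pattern has to match the entire value, not just a part of it.

```python
import re

# Pattern and formatter URL copied from the proposal table.
ALLOWED_VALUES = re.compile(r"[a-zA-Z0-9]+")
FORMATTER_URL = "https://nyti.ms/$1"


def format_short_url(identifier: str) -> str:
    """Build the short URL for an identifier (hypothetical helper).

    Assumption: the allowed-values pattern must match the whole string.
    Raises ValueError if the identifier does not match.
    """
    if not ALLOWED_VALUES.fullmatch(identifier):
        raise ValueError(f"not a valid short URL code: {identifier!r}")
    # Substitute the identifier for the $1 placeholder in the formatter URL.
    return FORMATTER_URL.replace("$1", identifier)


if __name__ == "__main__":
    # The four example values listed in the proposal.
    for code in ("2CyF3Jh", "2EIbKzZ", "2FtInln", "2hJcqMP"):
        print(format_short_url(code))
```

The sketch only builds the URL string; it does not check whether the link still resolves, which is the durability question raised in the discussion.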
Motivation
The New York Times is, by most measures, one of the most important newspapers in the world.
While online NYT articles seem to have at least two internal identifiers other than the long URL (the op-ed also has QXJ0aWNsZTpueXQ6Ly9hcnRpY2xlLzM4MGM0MGZhLWU5ZGYtNTg3Mi05NTcxLWUzMmUyZDBjNjYxMw==.legacy), this one seems to be the most useful for Wikidata to record since it forms a working URL.
Presumably this property would be used on items about NYT articles themselves (or within references), and the items would be notable as a result of being used as sources on other items. Jc86035 (talk) 14:52, 22 October 2018 (UTC)
Discussion
- Comment About how many NY Times articles have Wikidata items? ArthurPSmith (talk) 18:09, 22 October 2018 (UTC)
- 2235 according to Query: CLAIM[1433:9684] = SELECT ?item WHERE { ?item wdt:P1433 wd:Q9684 }. Visite fortuitement prolongée (talk) 19:20, 22 October 2018 (UTC)
- Oppose per w:en:URL shortening#Shortcomings and meta:Spam blacklist#URL shorteners. Visite fortuitement prolongée (talk) 19:20, 22 October 2018 (UTC)
- @Visite fortuitement prolongée, Pigsonthewing: Most of the concerns in the article don't actually apply, since nyti.ms isn't a public URL shortening service and presumably nyti.ms URLs are only generated by the New York Times for its own articles; most numeric and alphanumeric identifier systems obscure their subjects; .ms is the TLD for Montserrat, a British overseas territory (censorship unlikely); Wikidata can choose not to block the domain (and it doesn't); and the domain is registered under the New York Times Company. I would only be worried about the durability of the identifiers, but that is a concern for basically any URL. The domain is apparently run by bit.ly (or at least it was in 2009), but it has also lasted more than nine years so far and the domain is owned by the NYT itself. I don't think it's that different to other external IDs in Wikidata. Jc86035 (talk) 15:49, 23 October 2018 (UTC)
- Oppose per Visite fortuitement prolongée. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:53, 22 October 2018 (UTC)
- Support Well, seems fine to me, especially to link to over 2000 items. ArthurPSmith (talk) 18:41, 24 October 2018 (UTC)
- Support. Thierry Caro (talk) 10:13, 25 October 2018 (UTC)
- Comment I did not know that the New York Times was based in Montserrat. Visite fortuitement prolongée (talk) 14:59, 25 October 2018 (UTC)
- @Visite fortuitement prolongée: I meant that the domain had been registered with the TLD/country code for Montserrat, and it seems overseas companies are generally allowed to register .ms domains. Jc86035 (talk) 07:56, 26 October 2018 (UTC)
- I was not replying to one of your comments. Visite fortuitement prolongée (talk) 18:47, 26 October 2018 (UTC)
- Question Why create a property for a shortened URL and not use URL (P2699) or full work available at URL (P953)? Visite fortuitement prolongée (talk) 14:59, 25 October 2018 (UTC), 15:01, 25 October 2018 (UTC)
- @Visite fortuitement prolongée: It really depends on whether we think this is worth storing. The short URLs are potentially more durable than the longer URLs, and it's possible that either set of links could be broken in the future. Other news sites might assign their articles a numerical ID and stick a bunch of keywords into the URL for SEO, so that links with other text before/after the number redirect correctly to the article; in those cases it might also be beneficial to store the ID separately. There are definitely more than 200 NYT articles so it seems reasonable to create a property for it, particularly since NYT articles may be published elsewhere (e.g. articles with PubMed identifiers; syndicated articles by news agencies). Jc86035 (talk) 07:56, 26 October 2018 (UTC)
- Surely the URL property can take more than one value, so I am unclear why this couldn't be added there. Or, if it is useful to distinguish, should there be a general property for an item's short URL, if that's the issue? Dominic (talk) 22:19, 30 November 2018 (UTC)
- Comment P6085 (P6085), the equivalent to this property for The Guardian (Q11148), has been created. Jc86035 (talk) 15:05, 8 November 2018 (UTC)
- Does NYT refer to this anywhere as an identifier, or is it purely a string found in the URLs and we are inferring it is used as an identifier? Is there evidence that this string is unlikely to change or go away, in the same way as a unique identifier? At the very least, I am uncomfortable inventing a name for an identifier that is not used as such in the real world, and would call this "New York Times short URL code" or something like that. Dominic (talk) 17:08, 30 November 2018 (UTC)
- @Dominic: As with P6085 (P6085), if both the short URLs and long URLs work correctly, I would think that they are equally valid as identifiers. The 2009 article announcing the launch of the short URLs doesn't refer to them as stable identifiers, although if the links continue to function then they will obviously remain unique. Jc86035 (talk) 18:02, 30 November 2018 (UTC)
- Sure, I get that. I think there is a difference in meaning between a randomly generated code used in generating short URLs and an identifier in the authority control sense (which is how this seems to be proposed). Technically, every web property's URLs are unique, since that's how the web operates. But that does not make them identifiers. Dominic (talk) 22:19, 30 November 2018 (UTC)
- @Dominic: Is there much of a technical distinction between this and, say, Amazon Standard Identification Number (P5749) or YouTube video ID (P1651), other than that the others don't redirect? Almost all numerical/hexadecimal identifiers on Wikidata are either randomly or chronologically assigned; and the NYT servers that operate the short URLs could arguably be called (and would probably have to contain) a database. Jc86035 (talk) 17:44, 2 December 2018 (UTC)
- Oppose URLs are good enough for the purpose, and I don't see the need to add article ID URLs for all sorts of news websites. ChristianKl ❪✉❫ 13:50, 10 December 2018 (UTC)
- @ChristianKl: ... so should P6085 (P6085) be proposed for deletion? Jc86035 (talk) 14:26, 10 December 2018 (UTC)
- I don't think that property should exist either. ChristianKl ❪✉❫ 14:34, 10 December 2018 (UTC)