Wikidata:Property proposal/New York Times short URL

New York Times article ID

   Under discussion
Description: short URL for a New York Times article
Data type: External identifier
Allowed values: regex [a-zA-Z0-9]+
Example 1: I Am Part of the Resistance Inside the Trump Administration (Q56488792) → 2CyF3Jh
Example 2: Aziz Ansari Is Guilty. Of Not Being a Mind Reader. (Q48342255) → 2EIbKzZ
Example 3: The Slut-Shaming of Nikki Haley (Q48344584) → 2FtInln
Example 4: Barack Obama and Me (Q58450190) → 2hJcqMP
Source: HTML of nytimes.com
Expected completeness: always incomplete (Q21873886)
Formatter URL: https://nyti.ms/$1
See also: NYT topic ID (P3221)
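
As a rough illustration of how the allowed-values regex and the formatter URL above would fit together, here is a minimal Python sketch. The function names `is_valid_id` and `format_url` are hypothetical, and the sketch assumes the regex is meant to match the entire value rather than a substring.

```python
import re

# Pattern and formatter URL copied from the proposal's
# "Allowed values" and "Formatter URL" fields.
ALLOWED_VALUE = re.compile(r"[a-zA-Z0-9]+")
FORMATTER_URL = "https://nyti.ms/$1"


def is_valid_id(value: str) -> bool:
    """Return True if the whole string matches the allowed-values regex.

    Assumption: the regex must cover the entire value, not just part of it.
    """
    return ALLOWED_VALUE.fullmatch(value) is not None


def format_url(value: str) -> str:
    """Substitute the ID for $1 in the formatter URL; reject invalid IDs."""
    if not is_valid_id(value):
        raise ValueError(f"not a valid short URL code: {value!r}")
    return FORMATTER_URL.replace("$1", value)


# Example 1 from the proposal.
print(format_url("2CyF3Jh"))
```

The sketch only checks the shape of the code and builds the link; it does not check whether a given nyti.ms link actually resolves to an article.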

Motivation

The New York Times is, by most measures, one of the most important newspapers in the world.

While online NYT articles seem to have at least two internal identifiers other than the long URL (the op-ed also has QXJ0aWNsZTpueXQ6Ly9hcnRpY2xlLzM4MGM0MGZhLWU5ZGYtNTg3Mi05NTcxLWUzMmUyZDBjNjYxMw==.legacy), this one seems to be the most useful for Wikidata to record since it forms a working URL.

Presumably this property would be used on items about NYT articles themselves (or within references), and the items would be notable as a result of being used as sources on other items. Jc86035 (talk) 14:52, 22 October 2018 (UTC)

Discussion

  •   Comment About how many NY Times articles have Wikidata items? ArthurPSmith (talk) 18:09, 22 October 2018 (UTC)
  •   Oppose per w:en:URL shortening#Shortcomings and meta:Spam blacklist#URL shorteners. Visite fortuitement prolongée (talk) 19:20, 22 October 2018 (UTC)
    @Visite fortuitement prolongée, Pigsonthewing: Most of the concerns in the article don't actually apply, since nyti.ms isn't a public URL shortening service and presumably nyti.ms URLs are only generated by the New York Times for its own articles; most numeric and alphanumeric identifier systems obscure their subjects; .ms is the TLD for Montserrat, a British overseas territory (censorship unlikely); Wikidata can choose not to block the domain (and it doesn't); and the domain is registered under the New York Times Company. I would only be worried about the durability of the identifiers, but that is a concern for basically any URL. The domain is apparently run by bit.ly (or at least it was in 2009), but it has also lasted more than nine years so far and the domain is owned by the NYT itself. I don't think it's that different to other external IDs in Wikidata. Jc86035 (talk) 15:49, 23 October 2018 (UTC)
  •   Oppose per Visite fortuitement prolongée. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:53, 22 October 2018 (UTC)
  •   Support Well, seems fine to me, especially to link to over 2000 items. ArthurPSmith (talk) 18:41, 24 October 2018 (UTC)
  •   Support. Thierry Caro (talk) 10:13, 25 October 2018 (UTC)
  •   Comment I did not know that the New York Times was based in Montserrat. Visite fortuitement prolongée (talk) 14:59, 25 October 2018 (UTC)
    @Visite fortuitement prolongée: I meant that the domain had been registered with the TLD/country code for Montserrat, and it seems overseas companies are generally allowed to register .ms domains. Jc86035 (talk) 07:56, 26 October 2018 (UTC)
  •   Question Why create a property for shortened URL and not use URL (P2699) or full work available at (P953)? Visite fortuitement prolongée (talk) 14:59, 25 October 2018 (UTC), 15:01, 25 October 2018 (UTC)
    @Visite fortuitement prolongée: It really depends on whether we think this is worth storing. The short URLs are potentially more durable than the longer URLs, and it's possible that either set of links could be broken in the future. Other news sites might assign their articles a numerical ID and stick a bunch of keywords into the URL for SEO, so that links with other text before/after the number redirect correctly to the article; in those cases it might also be beneficial to store the ID separately. There are definitely more than 200 NYT articles so it seems reasonable to create a property for it, particularly since NYT articles may be published elsewhere (e.g. articles with PubMed identifiers; syndicated articles by news agencies). Jc86035 (talk) 07:56, 26 October 2018 (UTC)
    Surely the URL property can take more than one value, so I am unclear why this couldn't be added there. Or, if it is useful to distinguish, should there be a general property for an item's short URL, if that's the issue? Dominic (talk) 22:19, 30 November 2018 (UTC)
  •   Comment The Guardian article ID (P6085), the equivalent to this property for The Guardian (Q11148), has been created. Jc86035 (talk) 15:05, 8 November 2018 (UTC)
  • Does NYT refer to this anywhere as an identifier, or is it purely a string found in the URLs that we are inferring is used as an identifier? Is there evidence that this string is unlikely to change or go away, in the same way as a unique identifier? At the very least, I am uncomfortable inventing a name for an identifier that is not used as such in the real world, and would call this "New York Times short URL code" or something like that. Dominic (talk) 17:08, 30 November 2018 (UTC)
    @Dominic: As with The Guardian article ID (P6085), if both the short URLs and long URLs work correctly, I would think that they are equally valid as identifiers. The 2009 article announcing the launch of the short URLs doesn't refer to them as stable identifiers, although if the links continue to function then they will obviously remain unique. Jc86035 (talk) 18:02, 30 November 2018 (UTC)
    Sure, I get that. I think there is a difference in meaning between a randomly generated code used in generating short URLs and an identifier in the authority control sense (which is how this seems to be proposed). Technically, every web property's URLs are unique, since that's how the web operates. But that does not make them identifiers. Dominic (talk) 22:19, 30 November 2018 (UTC)
    @Dominic: Is there much of a technical distinction between this and, say, Amazon Standard Identification Number (P5749) or YouTube video ID (P1651), other than that the others don't redirect? Almost all numerical/hexadecimal identifiers on Wikidata are either randomly or chronologically assigned; and the NYT servers that operate the short URLs could arguably be called (and would probably have to contain) a database. Jc86035 (talk) 17:44, 2 December 2018 (UTC)
  •   Oppose URLs are good enough for the purpose, and I don't see the need to add article ID URLs for all sorts of news websites. ChristianKl❫ 13:50, 10 December 2018 (UTC)
    @ChristianKl: ... so should The Guardian article ID (P6085) be proposed for deletion? Jc86035 (talk) 14:26, 10 December 2018 (UTC)
    I don't think that property should exist either. ChristianKl❫ 14:34, 10 December 2018 (UTC)