Property talk:P9070

Latest comment: 3 years ago by Peter James in topic Backslash decoding problem

Documentation

Internet Encyclopedia of Ukraine ID
identifier for an article in the Internet Encyclopedia of Ukraine
Applicable "stated in" value: Internet Encyclopedia of Ukraine (Q87193076)
Data type: External identifier
Allowed values: [A-Z]\\[A-Z]\\[A-Za-z0-9]+
Example:
  Kyiv (Q1899): K\Y\Kyiv
  concentration camp (Q152081): C\O\Concentrationcamps
  Pavlo Zahrebelnyi (Q983989): Z\A\ZahrebelnyPavlo
Source: http://www.encyclopediaofukraine.com/main-index.asp
Formatter URL:
  https://wikidata-externalid-url.toolforge.org/?p=9070&url_prefix=http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\&id=$1
  http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\$1
Related to country: Ukraine (Q212) (See 57 others)
Proposal discussion
Current uses:
  Total: 1,119
  Main statement: 1,089 out of 8,136 (13% complete), 97.3% of uses
  Qualifier: 3, 0.3% of uses
  Reference: 27, 2.4% of uses
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P9070#Unique value, SPARQL (every item), SPARQL (by value)
Format “[A-Z]\\[A-Z]\\[A-Za-z0-9]+”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P9070#Format, SPARQL
Single value: this property generally contains a single value. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P9070#Single value, SPARQL
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified ways (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P9070#Scope, SPARQL
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P9070#Entity types
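
As a rough illustration of the format constraint listed above, the pattern can be checked in Python. This is only a sketch: the function name `is_valid_ieu_id` is made up for this example, and Python's `re` module stands in for PCRE, which behaves the same way for a pattern this simple.

```python
import re

# Pattern copied from the "Allowed values" field; in the regex, \\ matches
# one literal backslash.
IEU_ID_PATTERN = re.compile(r"[A-Z]\\[A-Z]\\[A-Za-z0-9]+")


def is_valid_ieu_id(value: str) -> bool:
    """Return True if the whole value matches the property's format pattern."""
    return IEU_ID_PATTERN.fullmatch(value) is not None


# Example values from the documentation box (backslashes are doubled only
# because these are Python string literals).
print(is_valid_ieu_id("K\\Y\\Kyiv"))         # True
print(is_valid_ieu_id("pages\\K\\Y\\Kyiv"))  # False: extra leading segment
```

Any value that carries more than the letter, letter, article-name structure fails this check, so a change to what the ID contains would also need a matching change to the pattern.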
This property is being used by:

Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)

Discussion

Backslash decoding problem

There’s a problem with how IDs are parsed to generate URLs. The formatter URL contains a backslash, which does not get percent-encoded when the URL is generated. I presume this is because backslashes aren’t handled properly by simple regex replacement.

Here’s an example to illustrate the problem:

  1. The formatter URL is http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\$1
  2. Wikidata item: history of Ukraine (Q210701) has the ID H\I\historyofukraine
  3. The generated link should be http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\ + H\I\historyofukraine + .htm
  4. And it should be URL-encoded and sent as http://www.encyclopediaofukraine.com/display.asp?linkpath=pages%5CH%5CI%5CHistoryofUkraine.htm, with every backslash <\> changed to <%5C>

But what actually gets sent is http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\H%5CI%5Chistoryofukraine with the first backslash as literal text. This happens to work for me, presumably because the server tolerates it. But it is unintended and messy, and I believe it is technically a malformed URL.
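
The behaviour described above can be reproduced with a small Python sketch. It assumes, without confirmation from the Wikibase source, that only the ID is percent-encoded before being substituted for `$1`, and that the formatter text itself is copied through unchanged. The function name `expand_encoding_id_only` is invented for this example.

```python
from urllib.parse import quote

# Formatter URL as given in this thread; "$1" is the placeholder for the ID.
FORMATTER = "http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\\$1"


def expand_encoding_id_only(formatter: str, value: str) -> str:
    """Percent-encode the ID only, then substitute it for $1.

    Assumption: the formatter text is left untouched, so any backslash
    written in the formatter itself stays literal.
    """
    return formatter.replace("$1", quote(value, safe=""))


url = expand_encoding_id_only(FORMATTER, "H\\I\\historyofukraine")
print(url)
# http://www.encyclopediaofukraine.com/display.asp?linkpath=pages\H%5CI%5Chistoryofukraine
```

The two backslashes inside the ID become `%5C`, while the one that belongs to the formatter survives as a raw backslash, which matches the link reported above.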

I propose correcting this by broadening the definition of the ID so that all backslashes will get processed. Maybe it’s neater and safer to include the entire unit of the linkpath parameter, for example, pages\H\I\historyofukraine.
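
A minimal sketch of this proposal, assuming the formatter would be shortened so that `$1` covers the whole linkpath value. `PROPOSED_FORMATTER`, `to_broadened_id` and `expand` are hypothetical names used only for illustration, and the encoding step is a stand-in for whatever Wikidata actually does.

```python
from urllib.parse import quote

# Hypothetical formatter for the proposal: the placeholder now covers the
# entire value of the linkpath parameter.
PROPOSED_FORMATTER = "http://www.encyclopediaofukraine.com/display.asp?linkpath=$1"

PAGES_PREFIX = "pages\\"  # the literal text: pages followed by one backslash


def to_broadened_id(current_id: str) -> str:
    """Add the pages prefix to an existing ID unless it is already there."""
    if current_id.startswith(PAGES_PREFIX):
        return current_id
    return PAGES_PREFIX + current_id


def expand(formatter: str, value: str) -> str:
    """Percent-encode the ID and substitute it for $1 (assumed behaviour)."""
    return formatter.replace("$1", quote(value, safe=""))


new_id = to_broadened_id("H\\I\\historyofukraine")
print(new_id)
# pages\H\I\historyofukraine
print(expand(PROPOSED_FORMATTER, new_id))
# http://www.encyclopediaofukraine.com/display.asp?linkpath=pages%5CH%5CI%5Chistoryofukraine
```

Because every backslash now sits inside the ID, all of them pass through the encoder. The format constraint would have to be widened to accept the new prefix.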

Unfortunately, this requires changing the ID for over 300 items. I am glad to make the update and do the work. Is there a best sequence for the changeover?

Please let me know if this sounds good. --Michael Z. 20:41, 17 February 2021 (UTC)

The encoding is being worked on at the moment, see phabricator:T271126. I'm not sure if it will change anything for backslashes. Ghouston (talk) 21:14, 17 February 2021 (UTC)
I think it probably will, since the "rawurlencode" PHP function does seem to encode "all non-alphanumeric characters except -_.~": https://www.php.net/manual/en/function.rawurlencode.php. Ghouston (talk) 21:18, 17 February 2021 (UTC)
Could the backslash be changed in the formatter? There seems to be nowhere to test, as Sandbox-External identifier (P2536) is stuck with a cached formatter, and formatter URLs are not converted into links at test.wikidata.org. Peter James (talk) 13:12, 18 February 2021 (UTC)
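
The formatter-side change suggested in the last comment could look roughly like the sketch below, where the backslash in the formatter is written as `%5C` in advance and the IDs stay as they are. Whether Wikidata would store and emit such a formatter unchanged is not established in this thread, so `PRE_ENCODED_FORMATTER` and `expand` are assumptions for illustration only.

```python
from urllib.parse import quote

# Hypothetical formatter: the backslash after "pages" is already written
# as %5C, so no raw backslash remains in the formatter text.
PRE_ENCODED_FORMATTER = (
    "http://www.encyclopediaofukraine.com/display.asp?linkpath=pages%5C$1"
)


def expand(formatter: str, value: str) -> str:
    """Percent-encode the ID and substitute it for $1 (assumed behaviour)."""
    return formatter.replace("$1", quote(value, safe=""))


url = expand(PRE_ENCODED_FORMATTER, "H\\I\\historyofukraine")
print(url)
# http://www.encyclopediaofukraine.com/display.asp?linkpath=pages%5CH%5CI%5Chistoryofukraine
```

With this variant the existing ID values would not need to change, only the formatter URL.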