Topic on User talk:Magnus Manske

Problem with reference format added by Reinheitsgebot

Jura1 (talkcontribs)

Hi Magnus,

At project chat, it was noted that the format of some of the references is suboptimal. Apparently User:Vojtěch_Dostál attempted to contact you about it before.

Personally, I noticed the following problems:

This seems to be a regression from the way it was done before. As I consider this a bug, I will ask for a block of User:Reinheitsgebot until it is resolved for new edits. Otherwise more cleanup will pile up.

If Reinheitsgebot uses an identifier, e.g. FAST ID (P2163), the property's applicable 'stated in' value (P9073) statement (at Property:P2163#P9073 for that property) can be used to determine the stated in (P248) value for the reference.

Also, if P2163 and P248 are present in the reference, reference URL (P854) isn't needed.
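
A minimal sketch of that lookup, assuming a simplified `{property: value}` representation of a reference rather than the full Wikibase JSON. The mapping `STATED_IN_FOR_PROPERTY` and the function `build_reference` are hypothetical names, and the QID is a placeholder, not the actual value found at Property:P2163#P9073.

```python
# Hypothetical mapping: identifier property -> item taken from its
# "applicable 'stated in' value" (P9073) statement.
# "Q_PLACEHOLDER" is NOT the real value from Property:P2163#P9073.
STATED_IN_FOR_PROPERTY = {"P2163": "Q_PLACEHOLDER"}


def build_reference(prop, external_id, stated_in_map=STATED_IN_FOR_PROPERTY):
    """Return the reference snaks as a simplified {property: value} dict."""
    reference = {}
    stated_in = stated_in_map.get(prop)
    if stated_in is not None:
        reference["P248"] = stated_in  # stated in
    reference[prop] = external_id      # the identifier itself
    # No P854 (reference URL) is added: the identifier already links out.
    return reference


print(build_reference("P2163", "1234567"))
```

If the property has no P9073 statement, this sketch falls back to a reference that carries only the identifier.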

Magnus Manske (talkcontribs)

Hi! The main issue appears to be "multiple references in one". This is a limitation of QuickStatements: all references given in a single "command row" will go into one reference, but if given as multiple rows, each row will create a new statement. I will have to alter the QuickStatements syntax to allow for multiple references in the same row. This will take time; I have deactivated the mass creation script that was introduced recently (1-2 weeks ago?). There will be occasional item creations with this issue (from Mix'n'match), but not in high volume. I think the usefulness of the bot will outweigh the occasional duplicate reference, for the time being.

As for the "wikidata-external-url", I am using the Wikidata "formatter URL" of this property. I believe that's the correct thing to do?

I will look into the other edge cases you mentioned.

Jura1 (talkcontribs)

Hi Magnus, thanks for looking into this. Occasional creations shouldn't be much of an issue. In general, even the mass creations are useful, but it's better not to have to do clean-ups on the items after creation.

If there is a property, the formatter URL needn't be applied. "stated in" plus the property itself is sufficient. See Help:Sources#Databases. Sample edit for the item above: . This has the advantage that when the formatter URL changes, the link is updated.

The "wikidata-external-url" redirector is used when Wikibase cannot convert them correctly. Somehow WMDE doesn't want to add a few lines of code to do it for these.

BTW, the "reference URL" qualifier on identifiers (e.g. at ) shouldn't be there either. The identifier itself should already link and I don't think that would be a reference for the statement.

Magnus Manske (talkcontribs)

As part of the fix for this, I just launched a "reference fixer" on ~10K items (initially). This will

  • break up multiple "reference URL" values into individual "groups"
  • replace reference URLs with property/value(/stated_in) snaks
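
The first step above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: references are modelled as a simplified `{property: [values]}` dict, `split_reference` is a hypothetical name, and keeping the non-URL snaks in a group of their own is an assumption.

```python
def split_reference(reference):
    """Split a reference with several P854 values into one group per URL.

    `reference` is a simplified {property: [values]} dict, not Wikibase JSON.
    """
    urls = reference.get("P854", [])
    if len(urls) <= 1:
        return [reference]  # nothing to break up
    groups = [{"P854": [url]} for url in urls]
    # Assumption: snaks other than P854 cannot be attributed to a single
    # URL, so they are kept together as a separate group.
    others = {p: v for p, v in reference.items() if p != "P854"}
    if others:
        groups.append(others)
    return groups


combined = {"P854": ["https://example.org/a", "https://example.org/b"]}
print(split_reference(combined))
```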

Example item.

Jura1 (talkcontribs)

Looks good. For one reference url, it seems this was missing on the property level.

Magnus Manske (talkcontribs)

It should treat http and https the same. Also, as it uses SPARQL to get the formatter URLs, deprecated ones might not show up there?

Jura1 (talkcontribs)

I don't know how your tool works. Old formatters need to use deprecated rank, otherwise they end up being used by Wikibase (somehow preferred rank isn't sufficient). I suppose you know that you can get deprecated-rank values by using p:P1630/ps:P1630.
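
A small sketch of the difference, assuming the query is sent to the Wikidata Query Service, where the `wdt:`, `p:` and `ps:` prefixes are predefined. The function name `formatter_url_query` is hypothetical. The `wdt:` form returns only best-rank statements, so deprecated formatter URLs are left out, while the `p:`/`ps:` path returns statements of every rank.

```python
def formatter_url_query(include_deprecated=True):
    """Build a SPARQL query listing formatter URLs (P1630) per property."""
    # wdt:P1630 yields only best-rank values (deprecated ones are excluded);
    # p:P1630/ps:P1630 walks the full statements, whatever their rank.
    path = "p:P1630/ps:P1630" if include_deprecated else "wdt:P1630"
    return (
        "SELECT ?property ?formatter WHERE {\n"
        f"  ?property {path} ?formatter .\n"
        "}"
    )


print(formatter_url_query())
```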

Maybe URL match pattern (P8966) can help for the conversion?

Magnus Manske (talkcontribs)

Clarification:

  • This is resolving URLs to properties via the "formatter URL"
  • I am not treating wikidata-externalid-url.toolforge.org ones any differently here
  • I am not adding a "retrieved" date, as I do not know when the information was originally imported into Mix'n'match
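
A rough sketch of resolving a URL to a property through its formatter URL while treating http and https alike. This is an illustration under stated assumptions, not the tool's actual code: `FORMATTERS` holds a made-up property ID and formatter, and `resolve_url` is a hypothetical name. Real formatter URLs would come from P1630 statements.

```python
import re

# Hypothetical data: property ID and formatter URL are placeholders.
FORMATTERS = {"P_EXAMPLE": "https://example.org/id/$1"}


def _strip_scheme(url):
    """Drop a leading http:// or https:// so both schemes compare equal."""
    return re.sub(r"^https?://", "", url, flags=re.IGNORECASE)


def resolve_url(url, formatters=FORMATTERS):
    """Return (property, value) if the URL fits a formatter URL, else None."""
    bare = _strip_scheme(url)
    for prop, formatter in formatters.items():
        prefix, placeholder, suffix = _strip_scheme(formatter).partition("$1")
        if not placeholder:
            continue  # formatter without "$1" cannot be inverted
        pattern = re.escape(prefix) + "(.+)" + re.escape(suffix)
        match = re.fullmatch(pattern, bare)
        if match:
            # The captured text is not validated against the property's
            # own format constraints in this sketch.
            return prop, match.group(1)
    return None


print(resolve_url("http://example.org/id/abc123"))  # ('P_EXAMPLE', 'abc123')
```

Because the captured value is whatever text follows the prefix, a check against the property's expected value format would be needed before the result is used as an identifier.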
LiberatorG (talkcontribs)

Hi, I posted on User talk:Reinheitsgebot#Invalid_username_references, but then I saw that you preferred to discuss the bot here on your own talk page.

Reinheitsgebot is changing URL references into invalid references to a username, as in . However, a URL path is not a valid username. Even if the current formatter URL resolves it to the original URL, that could change in the future if there is a better way to reference the user specifically. Here it is using GitHub username (P2037), but the bot also creates invalid references using other username properties such as Twitter username (P2002). Could you help to fix these invalid references and stop the bot from recreating them? Thanks!
