User:Bovlb/Don't underestimate ignorance

Never attribute to malice that which is adequately explained by ignorance.
—a misquotation of Hanlon's razor

There are a lot of things about the way Wikidata works that are not automatically obvious to newcomers. Because of this, when they try to get something done, they often make mistakes. These mistakes can be hard to distinguish from vandalism so, when reverting newcomer edits, it is important to think hard about whether there may be a benign interpretation. It's hard to reform genuine vandals, but it is much easier to help newcomers overcome honest mistakes. Once you start labelling them as vandals, however, it gets progressively harder. Only if you can identify the nature of the error can you guide them out of it. This essay is offered as a set of possible good faith explanations for apparent vandalism. My hope is that you will find some of these ideas useful in working with new users.

Often the user is attempting to do something beneficial for the project, but they don't know how to do it right, so instead they do it wrong. Typically it is easy for us to just revert their mistakes, but we should not lose sight of three things:

  1. They are trying to help
  2. They have invested their time and effort
  3. Whatever it is they were trying to achieve might be worth doing

Remember that a user who has worked hard only to see it all reverted may experience frustration and may not respond well to your intervention. This may lead to edit warring and block evasion that in turn escalate the response to them.

Would this be better as an FAQ on common newbie mistakes? I feel like this document is attempting to serve both purposes. Should some of these sections be rewritten as specialised user warning messages?

Blanking

Blanking is when a user removes many labels, descriptions, sitelinks and/or claims from an item, often leaving the item empty or nearly empty. This can be vandalism but, although blanking is never the right thing to do, there are a number of good-faith reasons that users attempt it:

  • Duplication: When we find duplicates, they ought to be merged, not blanked.
  • Notability: When we find non-notable items, they ought to be deleted, not blanked.
  • Privacy: A common case is that a borderline notable person is distressed to find themselves listed here and wants to have their item removed. Of course subjects don't get to control whether Wikidata considers them to be notable, but we can still be sensitive to their concerns (see Living people). Such users often phrase their concerns as a complaint about Google results.
  • Formatting: Another reason that new editors remove properties is that they are attempting to tweak the formatting of infoboxes or data tables on some client project. They might delete non-current values or remove the same property from many items. Of course, it is never correct to remove true (or sourced) claims from Wikidata in order to filter its presentation elsewhere; the answer may be to use ranking, or it may be that they need to make changes on the client project specific to the module they are using.

There is a specific form of notability blanking that is sometimes seen from users who have correctly filed a request for deletion. In anticipation of that deletion, they start blanking the item, especially sitelinks and incoming links from other items. Sometimes this flies under the radar and an admin will delete the item without realising that, until recently, it met the notability criteria. Again, it's possible that this is nefarious and deceitful, but many new editors apparently believe that such blanking is somehow helpful to the deletion process. Similarly, editors often blank an item when its sitelink has been deleted or listed for deletion on a client project.

Another odd pattern of deletion or overwriting occurs with official websites, social media accounts, and spouses. New users often see Wikidata as a directory of current information. If a website goes dark or a social media account is closed, they want to delete or replace the corresponding claim. If a celebrity divorces and remarries, the new user (and, presumably, the celebrity) wants to replace the old spouse with the new one. The right approach, of course, is to have multiple values, perhaps with start and end time, and perhaps with the new value preferred or the old one deprecated (see Ranking).
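
The "multiple values" approach can be sketched as follows. This is a minimal illustration, not Wikidata's actual data model or API: the `Statement` class, the `best_values` helper, and the spouse names are all hypothetical. It assumes only that a consumer asking for the "best" values takes preferred-rank statements when any exist, otherwise normal-rank ones, and never deprecated ones.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    """Hypothetical stand-in for one value of a property such as spouse."""
    value: str
    rank: str = "normal"               # "preferred", "normal", or "deprecated"
    start_time: Optional[str] = None   # stands in for a start time qualifier
    end_time: Optional[str] = None     # stands in for an end time qualifier


def best_values(statements: List[Statement]) -> List[str]:
    """Return preferred-rank values if any exist, else normal-rank values.

    Deprecated statements are never returned, but they stay in the data.
    """
    preferred = [s.value for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s.value for s in statements if s.rank == "normal"]


# Both marriages are kept; the current one is marked preferred.
spouses = [
    Statement("first spouse (hypothetical)", start_time="2001", end_time="2010"),
    Statement("second spouse (hypothetical)", rank="preferred", start_time="2015"),
]

print(best_values(spouses))
```

A consumer that only wants current information sees the second spouse, while the history of the first marriage remains available to anyone who asks for all statements.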

One more pattern of deletion is the removal of sitelinks that are redirects. One explanation for this is that new users are confused about the way we handle redirect links on client projects, and believe that redirects should be deleted on sight.

Repurposing

Repurposing is when a user edits an item about one concept in order to make it about a different concept. There are a number of good faith reasons that new users do this:

  • Conflation: New users are often unfamiliar with the idea of ontologies and identified concepts. If they encounter an item about a John Smith they consider to be obscure, they want to rewrite it to be about their favourite John Smith.
  • Scripts: There is at least one script that allows a user to create a new item by duplicating an existing one. This is narrowly useful when faced with the task of splitting one (conflated) item into two (e.g. a death event and the victim). Sometimes new users stumble upon this script and decide it is the best way to create arbitrary new items, leading to apparent (and often incomplete) repurposing.
  • Limits: Sometimes new users can be very confused about how we do things here. For example, at least one new user has failed to find the "Create a new item" link and concluded that we have a limited number of Q-ids, and that the best way to create a new item is to find an existing one that seems non-notable and repurpose it. As you can imagine, this can lead to some chaotic results.
  • Examples: We encourage new users to use existing items as examples of how to create a new one. Some new users will repurpose an existing item multiple times, apparently under the impression that they are creating new items.

Odd labels, descriptions, and aliases

One odd behaviour we see again and again is that a new user will change the label on some commonly used item to an incompatible peer value. For example, they may change the label on green (Q17122854) to "grey", or on Alex (Q13258171) to "Alan". This might seem as if it couldn't be anything other than vandalism, but I believe it commonly arises from a new user not understanding how to change the object value on some item. If some celebrity is recorded as having green eyes but the new user wants to change this to grey, many of them seem to believe that changing the label on green (Q17122854) is the right way to achieve this, ignorant of the effect this will have on all the other green-eyed people.
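
The difference between the two edits can be shown with a toy model. This is only a sketch under stated assumptions: the dictionaries, the people, and the placeholder ID "Q_GREY" are hypothetical, and real items carry far more structure. The point is that a label belongs to a shared item, so relabelling it changes what every referring item displays.

```python
# Hypothetical miniature data model: items have labels, and claims
# point at other items by ID rather than by text.
labels = {"Q17122854": "green", "Q_GREY": "grey"}  # "Q_GREY" is a made-up ID
eye_color = {"person A": "Q17122854", "person B": "Q17122854"}


def shown_eye_colors(labels, claims):
    """Resolve each claim's target ID to the label a reader would see."""
    return {person: labels[qid] for person, qid in claims.items()}


# Mistaken edit: relabel the shared item. Every green-eyed person changes.
wrong_labels = {**labels, "Q17122854": "grey"}
print(shown_eye_colors(wrong_labels, eye_color))

# Intended edit: point person A's claim at a different item instead.
right_claims = {**eye_color, "person A": "Q_GREY"}
print(shown_eye_colors(labels, right_claims))
```

In the first case both people appear grey-eyed; in the second only person A does, which is what the editor wanted.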

Another common error is to add or change a label or description in the wrong language. Sometimes this is vandalism, but often users just don't seem to realise that these fields are language-specific, and they edit the first entry they see.

Our convention for labels and descriptions is that they should be in mid-sentence case. New users will often capitalize labels and descriptions. Worse, some will work their way through valid labels, giving them inappropriate capitalization. What might not be obvious is that some client projects (e.g. Commons) make substantial use of Wikidata labels and, depending on context, they will look wrong if they start with a lowercase letter. The correct solution is to use the capitalization=ucfirst flag on the client project, but new users are not likely to stumble on that solution. Another common error is to add periods to the end of labels or descriptions; presumably this is also intended to improve formatting on a client project.
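
The idea behind fixing capitalization on the client side can be sketched in a few lines. This is an illustration of the principle only, not the implementation behind the capitalization=ucfirst flag; the function name `ucfirst` is borrowed from that flag and the rest is hypothetical.

```python
def ucfirst(label: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    return label[:1].upper() + label[1:]


stored_label = "green"        # stays in mid-sentence case in the stored data
print(ucfirst(stored_label))  # capitalized only where it is displayed
```

The stored label is never modified, so other consumers that use it mid-sentence are unaffected.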

Some new users will create or change labels to include parenthetical disambiguators. Commonly this comes from copying the sitelink. New users see the importance that client projects like Wikipedia place on such disambiguators and don't realise that Wikidata uses a separate description field for that purpose.

Some new users will get utterly confused between the label, description, and alias fields. Some put extra descriptions in the alias field, or put labels in the description field. I'm guessing that this is because the field names are unclear in some language. Sometimes users will simply type the name of the language into one of the fields, presumably thinking they are setting the language for the other fields.

Some new users appear to think that they can use the description field to send messages to the entity described.

A very common problem edit is where a contiguous sequence of characters is deleted from the middle or end of a label or description (or sometimes other text fields). My guess is that some Wikidata editing tool (e.g. mobile) is making it easy to do this inadvertently.
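
For anyone wanting to spot this pattern when reviewing changes, the check can be sketched as below. This is a hypothetical helper, not an existing Wikidata tool: it reports whether the new text is exactly the old text with one contiguous run of characters removed (from the start, middle, or end).

```python
def is_contiguous_deletion(old: str, new: str) -> bool:
    """True if `new` equals `old` with one contiguous run of characters removed."""
    if len(new) >= len(old):
        return False
    # Length of the longest common prefix of the two strings.
    p = 0
    while p < len(new) and old[p] == new[p]:
        p += 1
    # Whatever follows the shared prefix in `new` must be a suffix of `old`.
    return old.endswith(new[p:])


print(is_contiguous_deletion("human settlement", "human sment"))
print(is_contiguous_deletion("green", "grey"))
```

A match does not prove the edit was accidental, but it is a hint that the user may have slipped rather than vandalised.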

Multiple accounts

Sockpuppetry is when a user makes multiple accounts for a nefarious purpose like vote stacking (giving the impression that a position has more support than it really does), evasion of scrutiny, or block evasion. This can be a big problem and is strictly forbidden here at Wikidata. A related activity is meat puppetry, when a user gets their friends to sign up and support some position.

Some users will create a series of accounts (and edit logged out) without intending to do anything nefarious. This may be hard to distinguish from sockpuppetry, but the clue is that they often make no attempt to hide their continuing identity, making the same edits and contributing to the same discussions as if nothing had happened, and often vaguely acknowledging previous accounts.

There are several possible good faith reasons why these users create and use many accounts:

  • Lost passwords: Setting a recovery email is optional, and many users either don't take advantage of it or mistype their own email address.
  • Confusion about blocking: Strange as it may seem to us old hands, many new users just don't understand what is going on when they are blocked and can no longer edit "the encyclopedia anyone can edit". In particular, they do not appreciate our strict rules about how blocks are intended to apply to the editor and not the account. Many new users apparently believe that the best way to proceed is to create a new account "just to get the old one unblocked" and then post their sob story in various places, making no attempt to hide that they are evading their block. This usually results in sockpuppet investigations galore, an indefinite block on all their accounts, and great difficulty in successfully arguing for a subsequent unblock.
  • Client project policies: Some client projects (e.g. English Wikipedia) have strict rules on user names; for example, user names are not permitted to suggest that the editor represents a company. Wikidata has no such rules but nevertheless experiences some of the fallout of such policies. Many users just abandon their first account and create a new one. To make things worse, the first account name was probably open about the editor's conflict of interest, but these policies caused the editor to hide this with the new account.
  • IP hopping: Some users who edit logged out have an IP address that changes continually (especially with IPv6), and this is beyond their control. No feedback on edits or notifications of reverts will reach them.

Recreation

Another thing new users don't understand about Wikidata is how our notability policies work. (Could you have predicted what they would be before you encountered them?) Some new users believe that Wikidata is just a free-for-all directory like LinkedIn or Facebook, where everyone is welcome to post their own entry. They will create an item on themselves (or some other topic they are heavily invested in), but the result will often fail to communicate any evidence that the concept passes our notability criteria. The item will then be deleted.

Users are typically not informed that their creation has been destroyed. (We don't notify creators about requests for deletion. We don't have a user warning message for "I deleted your item".) For technical reasons, it is hard for them to find out what happened to their item. (On a Wikipedia project, they can search under the concept label and possibly find their way to a deletion log, but not on Wikidata. For non-admins, it is pretty much impossible to find a user's deleted items.) It is simply gone.

It's not really surprising then, that a new user may scratch their head and start again, creating a new item. Again and again and again. Until we block them for spamming.

Edit warring

It is obvious to (most of) us old hands that edit warring is bad for the project and typically does not end well, but it is something that many new users (and some not so new users who should know better) reach for first on finding themselves reverted. We know about bold-revert-discuss and the various ways to seek consensus, but all this isn't obvious to new users. Wikidata feels similar to many user-editable websites that do not have our background activity of collegial collaboration. New users also feel a sense of ownership over their created items (and, worse, over items denoting themselves) and don't understand why they don't get to just have it the way they want.

See Wikidata:Edit warring, but also try opening a discussion (on the item's discussion page, the user's talk page, or on Wikidata:Project chat). If restoring claims that might be contested, try improving their references. Above all, don't repeat yourself!

Non-responsive

New users will make a lot of mistakes. If handled correctly, they can be given feedback about those mistakes and shown how to improve. There is a class of users who apparently never respond to any feedback (either by engaging in discussion or by changing their editing practices). Such editors often end up with an indefinite block, despite the fact that they were clearly well-intentioned and (perhaps) many of their edits were positive contributions.

It's very hard to say why users who won't communicate won't communicate, but here are a few ideas gleaned from the handful of cases where I have seen the communication barrier break:

  • Ignoring messages: Some users simply don't notice the message indicator. They just don't realise that these are collaborative projects or that they are expected to read and respond to messages from other users.
  • Language: Most user warning messages are written in English. Many users cannot read English, but that should not prevent them from participating in this project. If I get a message in a language I cannot read, I immediately reach for Google Translate, which works well in the vast majority of cases, allowing dual language conversations to proceed. From observation, many users will incuriously ignore messages if they cannot read the language. Sometimes the solution is to guess the user's language and find someone who can write a message they will understand.
  • Vagueness: A lot of the time, feedback given to users is unactionably vague. The classic is "I undid one or more of your recent contributions because it didn't appear constructive." That message, while tactfully phrased, tacitly assumes that the user was being deliberately naughty and is going to know what they did wrong. If used for good faith mistakes and genuine confusion, it may fail to dispel that confusion in many cases. Of course, if we received such a message, we would follow up and request clarification. Note that, even if a new user does reply, they are likely to do so on their own talk page and will not think to use {{Ping}}, so the user who supplied the warning will never see the question.
  • Timing: Don't expect a new user to see your message instantly, understand it readily, and change their behaviour radically. These things can take time.

Other

A common problem edit is when instance of (P31) human (Q5) is replaced with something like human settlement (Q486972), Humano (Q76717332), or Human Being (Q2628591). My guess is that the editor was unsatisfied with the presentation of an infobox on some client project and sought to improve it here, not realising the difference between strings and ontological concepts. Possibly these items have more plausible labels in some other language.

Another common pattern with new users is that they will create (sometimes many) new items that are just label and description. The label and description are vague general terms. I suspect that such users are attempting to search for existing items and somehow coming across "Create a new item" first instead.