Wikidata talk:Notability/Archive 7

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.


Criterion 2 for notability seems overly broad

Criterion 2 currently ends: The entity must be notable, in the sense that it can be described using serious and publicly available references. We have a number of companies using Wikidata to advertise their businesses. See Special:WhatLinksHere/Q62849941 for those which use local business (Q62849941). Their websites are serious and publicly available. Serious sources, as the term serious is usually used in English, means almost anything that exists can have a Wikidata entry.

There is no page here equivalent to en:Wikipedia:Reliable sources. The closest is Wikidata:Verifiability#Authoritative sources, which requires "free of bias". Can we propose changing the last part of criterion 2: "described using serious authoritative and publicly available references."? StarryGrandma (talk) 22:07, 30 March 2022 (UTC)

Practical impression from the deletion department: many admins usually do not consider sources that are directly (own website, social media, etc.) or indirectly (e.g. obviously promotional content) under control of the subject itself as "serious" in terms of notability. In other words: if this is the only source, it is usually not sufficient for inclusion. —MisterSynergy (talk) 22:31, 30 March 2022 (UTC)
I don't think the issue is on the deletion side (except that these don't come up for deletion). It is on the incoming side. It would be nice if the wording of the policy matched the application of the policy. StarryGrandma (talk) 01:16, 31 March 2022 (UTC)
Sure, I did not mean to oppose potential changes to the wording of the policy. —MisterSynergy (talk) 05:41, 31 March 2022 (UTC)

OPPOSE - this is not wikipedia, the purpose of wikidata is to serve as an organisational database. inclusion criteria & notability are not intended to mirror wikipedia's policies on same. inclusion criteria on here are supposed to be broad; the purpose of wd is (very) basically to compile a huge, well organized, & comprehensive cross-indexed database file of subjects, for other wiki-projects (& outside end users) to make use of; for sorting, cross linking, etc., etc., etc.

(& anyone seriously "using wikidata to advertise their business" is out of their mind; what exactly are the supposed "promotional benefits" of having a wikidata item on here? are there millions of avid end-users roaming wikidata, that i do not know about....? did we become cool & popular with the general wiki-using public, without knowing it, because i really don't see that happening, tbh. just being listed on here has pretty minimal "promotional value" as far as i can tell. it's about equivalent to getting an index-card in a library's dewey decimal system file-boxes, in old-school terms. i would be fascinated if anyone can offer examples to the contrary?) xD Lx 121 (talk) 08:46, 11 April 2022 (UTC)

You do know that data from Wikidata is being used in a number of ways, especially for Google’s Knowledge Graph? --Emu (talk) 15:47, 21 April 2022 (UTC)

Extend criterion number 3 for notability

I would propose to extend notability criterion number 3, because I was involved in this issue recently:

Current rule: It fulfills a structural need, for example: it is needed to make statements made in other items more useful.

Proposed change: It fulfills a structural need, for example: it is needed to make statements in other items possible, or it is required in other subsystems, e.g. to be backwards referred from a Commons Creator page (P1472) target page on Wikimedia Commons. Geertivp (talk) 21:33, 11 September 2022 (UTC)

I don't think that this extension is necessary. We do already consider all use in Wikimedia projects as "structural use" implicitly, except maybe for cases of clear vandalism.
That said, Q78188373 should not contain any unsourced data (beyond trivial claims such as P31, P735, P734, P1472) here at Wikidata if there is no sitelink attached. —MisterSynergy (talk) 21:42, 11 September 2022 (UTC)
Do we? #3 explicitly states the need “to make statements made in other items more useful” … --Emu (talk) 22:19, 11 September 2022 (UTC)
Yes we do. "Structural need" has never been formally defined. In practice we have long reached informal consensus that any legit use in a Wikimedia project constitutes "structural need".
As indicated above, verifiability needs to be possible for all content ("you can Google it yourself" is not sufficient). If some data in question has not been published elsewhere, it does not belong in Wikidata; if it has, a source can and should be added – particularly if there is no sitelink connected to the item. —MisterSynergy (talk) 22:30, 11 September 2022 (UTC)
The creator template you created is out of scope on Commons, see Commons:Commons:Creator#Scope. Adding this would create a circular dependence. Multichill (talk) 16:40, 12 September 2022 (UTC)

What parts of the Wikimedia Foundation are notable?

Wikidata:WikiProject Wikimedia Foundation

Bluerasberry (talk) 19:11, 6 November 2022 (UTC)

The answer is simple: If they conform to WD:N, they are notable. If you don’t like this situation, try to reach consensus to change this situation. --Emu (talk) 19:30, 6 November 2022 (UTC)

Websites used as sources on Wikipedia

I recently created entries on some websites used as sources on Wikipedia (Ex. Q115135103, Q115163952). They are widely used (thousands of times, at least, see bestref.net/pl/ ). I hope that the entries meet Wikidata notability, points 2 and/or 3? Please let me know if I am correct in creating entries for them. Piotrus (talk) 03:01, 11 November 2022 (UTC)

@Piotrus seems suspicious, no external serious source(s) have been given. Please don't create them en masse. It is possible that they don't meet WD:Notability Estopedist1 (talk) 12:11, 11 November 2022 (UTC)
@Estopedist1 What is suspicious? What sources are needed? I provided a source above, I am just creating entries for them. I am pretty sure WD:N is met, particularly criterion 2. Piotrus (talk) 16:51, 11 November 2022 (UTC)
@Piotrus: Items on wikidata must be able to be "described using serious and publicly available references". Please read WD:Notability. BrokenSegue (talk) 18:46, 11 November 2022 (UTC)
  • WD:N #2 means you need to provide references to serious sources (roughly speaking: subject independent, not user-generated content, not promotional content) for key statements inside the item.
  • WD:N #3 means that the items are in use somewhere in a Wikimedia project. This could be a backlink inside Wikidata from other items (e.g. in references, or maybe even from a (not yet existing) property for this database); it would also be valid to use the item in a Wikipedia project, so that actual data from it is displayed on Wikipedia—an example would be a database link template that somehow uses these items.
MisterSynergy (talk) 18:56, 11 November 2022 (UTC)
@MisterSynergy Thanks for clarifying. If an item (website) is used as a source on Wikipedia (and I don't mean once or twice, I mean thousands of times), would it meet criteria number 3? Piotrus (talk) 02:37, 12 November 2022 (UTC)
Yes. ---MisterSynergy (talk) 05:43, 12 November 2022 (UTC)

Remove phrase "It refers to an instance of a clearly identifiable conceptual or material entity."

I think we should delete the phrase "It refers to an instance of a clearly identifiable conceptual or material entity." in #2

People get way too confused that this means that they can document anything that is "clearly identifiable" (basically anything) and forget to read that it "can be described using serious and publicly available references". The important part is the second part of the rule, not the beginning which can be true for anything. Lectrician1 (talk) 22:21, 4 December 2022 (UTC)

I don’t think that this will change much. People always consider their preferred source to be “serious”. Apart from that, the problem seems to be that there is a lot of tacit knowledge about what constitutes “serious” and that knowledge is ever-evolving. I try to document what this term is supposed to mean but this document hasn’t really caught anyone’s attention. So it’s probably not that big of a problem for most users (or my try is really bad). --Emu (talk) 22:28, 4 December 2022 (UTC)
It might not prevent people from creating non-notable items, but it will stop them from trying to argue that it's "clearly identifiable" which is just bogus. Having notable sources on the other hand is a lot harder to argue. Lectrician1 (talk) 22:32, 4 December 2022 (UTC)
I don't think it should be deleted, because items in Wikidata should refer to a specific concept, that is, they shouldn't refer to multiple concepts (aka Wikidata:Conflation). The part that you want to delete is important in this sense.
Perhaps a possible solution would be to explicitly use "and" between the two parts. The current content of #2 is (revision as of the time of this writing)

It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references.

I propose it is changed to

It refers to an instance of a clearly identifiable conceptual or material entity and it must be notable, in the sense that it can be described using serious and publicly available references.

-- Rdrg109 (talk) 22:32, 4 December 2022 (UTC)
“I don't think it should be deleted, because items in Wikidata should refer to a specific concept, that is, they shouldn't refer to multiple concepts (aka Wikidata:Conflation). The part that you want to delete is important in this sense.”
This is our notability policy though, not our data quality policy. New editors particularly would still consider anything to be a "clearly identifiable conceptual or material entity" and might not even realize it's conflated. So, including this phrase is both useless and distracting. Lectrician1 (talk) 22:38, 4 December 2022 (UTC)
Yes, they are two different policies, but this doesn't mean they can't be aligned.
The reason why I proposed using "and" is that when new users read this sentence, the two conditions will be made clear to them. Currently, I feel that there exists the possibility that #2 is understood as a single condition which is stated in the first part (It refers to an instance of a clearly identifiable conceptual or material entity.) and the second part (The entity must be notable, in the sense that it can be described using serious and publicly available references.) is understood as a rephrasing of the first part.
-- Rdrg109 (talk) 22:50, 4 December 2022 (UTC)
But why do we need the first part? It's useless. Lectrician1 (talk) 23:09, 4 December 2022 (UTC)
In Wikipedia, even if there are some guides about identifying reliable sources, there is no precise definition of this term.--GZWDer (talk) 23:44, 4 December 2022 (UTC)

What about rephrasing it as:

  1. It refers to a notable entity in that
    1. It refers to a clearly identifiable conceptual or material entity and
    2. It can be described using serious and publicly available references.

?

BrokenSegue (talk) 00:19, 5 December 2022 (UTC)

The proposal is to remove #1. If you want to propose something otherwise, do it in a new topic. Lectrician1 (talk) 01:00, 5 December 2022 (UTC)
the problem you identified is people reading half of the definition and ignoring the other half. I'm suggesting a way to emphasize that there are two equal halves. I will admit that it's very hard to have something meet the latter and not the first condition. BrokenSegue (talk) 01:23, 5 December 2022 (UTC)
And I'm also saying that the first part is unnecessary... Lectrician1 (talk) 03:54, 5 December 2022 (UTC)
It might seem useless if you think about people, most organizations and most buildings. But Wikidata is a database about the whole world. This provision essentially safeguards against certain cases of bad modeling. If it can’t be properly described using Wikidata’s modeling, it can’t be notable (at least under the provision of WD:N #2). --Emu (talk) 13:57, 5 December 2022 (UTC)
@Emu What's an entity that wouldn't meet the first part of rule that someone could create an item for? Lectrician1 (talk) 17:35, 5 December 2022 (UTC)
"Star Wars" without identifying whether the item is for the first movie, the whole franchise, the novel, comic book, video game, theme park attraction, etc. –LiberatorG (talk) 18:35, 5 December 2022 (UTC)

  Oppose: If #2 required actually adding such serious and publicly available references then the first part may not be needed, since those references would hopefully clearly identify the item. But we only require that it can be described using such sources, so the first part is needed. Otherwise someone could add an item for "John Smith" without clearly identifying which John Smith the item refers to. If it is connected to a Wikipedia disambiguation page then it is allowed under #1, but even if there are serious and publicly available references describing some John Smith it should not be allowed under #2 since the specific entity is not clearly identified and so other people won't be able to add those references or make use of the item. –LiberatorG (talk) 18:18, 5 December 2022 (UTC)

That makes sense. I see why we need it now.   Resolved Lectrician1 (talk) 18:40, 5 December 2022 (UTC)

Pages ending in .js

"Pages ending in .css or .js are not allowed."

The sitelinks for Node.js (Q756100) end in ".js", so I think this should be more specific. -wd-Ryan (Talk/Edits) 01:31, 15 December 2022 (UTC)

I think this refers to pages with content models "css", "sanitized-css", "javascript", and maybe also "json". —MisterSynergy (talk) 07:50, 15 December 2022 (UTC)
I have updated the draft to say "content model" instead of "Pages ending in .css or .js" Lectrician1 (talk) 16:01, 15 December 2022 (UTC)
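The difference between the two rules can be illustrated with a minimal Python sketch. The content model names are the ones listed in the comment above ("json" was only a "maybe" there); the function names and sample pages are hypothetical and are not part of any policy text or MediaWiki API.

```python
# Content models named in the discussion; "json" was only suggested as a possibility.
EXCLUDED_CONTENT_MODELS = {"css", "sanitized-css", "javascript", "json"}


def excluded_by_suffix(title: str) -> bool:
    """Old wording: exclude any page whose title ends in .css or .js."""
    return title.lower().endswith((".css", ".js"))


def excluded_by_content_model(content_model: str) -> bool:
    """Draft wording: exclude pages based on their content model, not their title."""
    return content_model in EXCLUDED_CONTENT_MODELS


# Hypothetical sample pages as (title, content model) pairs.
pages = [
    ("Node.js", "wikitext"),                 # an ordinary article whose title ends in .js
    ("MediaWiki:Common.js", "javascript"),   # an actual script page
    ("MediaWiki:Common.css", "css"),         # an actual stylesheet page
]

for title, model in pages:
    print(title, excluded_by_suffix(title), excluded_by_content_model(model))
```

Under the suffix rule the Node.js article is wrongly excluded; under the content-model rule only the script and stylesheet pages are.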

Criteria #1 was reworded but the new wording is sloppy and conflates what is valid and what contributes to establishing notability. For example the previous wording was very clear that although sitelinks to redirects are permitted, they do not count toward establishing notability; the rewording says that they are "allowed" but nothing about the fact that they do not establish notability. Other things like subpages in the Portal namespace are now missing their rule entirely in the rewording. It would probably be best to avoid words like "not allowed" since it is not clear what that means; instead focus on what does or does not establish notability since that is the purpose of this page. –LiberatorG (talk) 09:50, 15 December 2022 (UTC)

Yes, yesterday’s overhaul of the policy page by User:Lectrician1 is a botch job in its current form. The policy does have its problems and it should be improved, but I would have preferred to see a draft that we can discuss before everything is redone. —MisterSynergy (talk) 10:09, 15 December 2022 (UTC)
@MisterSynergy I totally agree. I seem to lack the skills to undo the revisions because of some translation stuff going on. Can you or can somebody else restore the original version? --Emu (talk) 12:09, 15 December 2022 (UTC)
Undone for now. We can still discuss proposals that aim to improve the policy page, though. —MisterSynergy (talk) 12:25, 15 December 2022 (UTC)
I have created Wikidata:Notability/Sitelink criteria draft to draft a more-organized sitelinks criteria section. Feel free to edit it. Lectrician1 (talk) 16:00, 15 December 2022 (UTC)
“For example the previous wording was very clear that although sitelinks to redirects are permitted, they do not count toward establishing notability; the rewording says that they are "allowed" but nothing about the fact that they do not establish notability.”
I have updated the wording in the draft to say: "Redirects to notable pages are allowed as sitelinks."
“Other things like subpages in the Portal namespace are now missing their rule entirely in the rewording”
I have added it in the draft.
“It would probably be best to avoid words like "not allowed" since it is not clear what that means; instead focus on what does or does not establish notability since that is the purpose of this page.”
I used the wording "not allowed" because it is very straightforward, particularly to newcomers, that a specific type of sitelink is not allowed to be a sitelink on Wikidata. Saying whether a sitelink is "notable" for every rule is a lot more confusing, especially if someone doesn't know what "notability" means. Lectrician1 (talk) 16:07, 15 December 2022 (UTC)
“Redirects to notable pages are allowed as sitelinks.”
A large number of redirects are variants, alternative names or misspellings of the original title, which does not merit an item. I propose the following instead:
(option 1 which is functionally identical with the status quo)
Redirects are allowed as sitelinks but will not count towards establishing notability.
(option 2)
Redirects count towards establishing notability if:
  • The redirect target is a main-namespace article. (if the redirect target is in another namespace, such sitelinks are allowed but will not count towards establishing notability.)
  • The redirect refers to an identifiable entity that is mentioned in the target article.
An item for a template must have at least 2 valid sitelinks, and any of them must not be one of /doc, /XML, /meta, /sandbox, /testcases or /TemplateData subpages. Items for non-subpages can be created with only one sitelink, but shouldn't be created in great numbers.
For some discussion about the last point see Wikidata:Project_chat/Archive/2016/07#Notability_of_items_for_templates and Wikidata:Project_chat/Archive/2018/11#Item_about_Template_with_only_one_sitelink. The pre-2016 version reads simply "If a link is a subpage of a template, the item must contain at least two such sitelinks, and any of them must not be one of /doc, /sandbox, /testcases or /TemplateData subpages." and allows all non-subpages.
The status of subpages of mainspace pages (for example, individual chapters) is undetermined.
We already have many of such items and should we make them formal?
--GZWDer (talk) 16:33, 15 December 2022 (UTC)
Option 2 is good because it specifies what redirects are notable.
Yes, please remove that single template sentence. I forgot to do that.
“We already have many of such items and should we make them formal?”
Sure. Lectrician1 (talk) 16:42, 15 December 2022 (UTC)
There are no redirects that count towards establishing notability. This was very clear in the original text and has been discussed previously. Of course the item can still be notable due to criteria #2 or #3 or other sitelinks that aren't redirects, but if it can't be described by serious publicly available sources then it doesn't get a free pass just because of a redirect. The free pass only goes to things that were determined to merit an actual article.
The purpose of this criteria #1 is to meet the Wikidata goal of centralizing interlanguage links; that is why we allow things that can't be described by serious publicly available sources. For a redirect, the user will be redirected to another article with its own interlanguage links so that does not apply. However if the entity has a full article in one Wikipedia but is just a section in another Wikipedia, we do allow a sitelink to a redirect that redirects to that section so that it is easier to find that from the language with the full article. However if the only sitelinks are redirects then that won't help with this goal and the item needs to meet criteria #2 or #3 to be notable. –LiberatorG (talk) 18:36, 15 December 2022 (UTC)
I have recently added the redirect sitelink badges to redirect sitelinks. From my experience, there is liberal use of redirects in Wikipedias, many of them are legit sitelinks even if they point to some article or section with really scarce information about the entity. I do not think that sitelinks to redirects should confer any notability; it is beyond our control how they are used in client wikis. —MisterSynergy (talk) 18:49, 15 December 2022 (UTC)
I have revised this page.--GZWDer (talk) 19:34, 15 December 2022 (UTC)
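The status-quo reading defended in this thread (redirect sitelinks are allowed, but only non-redirect sitelinks establish notability under criterion 1) could be sketched roughly as below. The `Sitelink` class and the function name are hypothetical, and the sketch deliberately ignores the namespace and subpage rules that criterion 1 also contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sitelink:
    """Hypothetical, simplified model of a sitelink on an item."""
    site: str
    title: str
    is_redirect: bool = False


def sitelinks_establish_notability(sitelinks: list[Sitelink]) -> bool:
    """Redirect sitelinks are permitted but confer no notability on their own.

    Assumption: every non-redirect sitelink passed in is otherwise valid
    (namespace and subpage exclusions are not modelled here).
    """
    return any(not link.is_redirect for link in sitelinks)


only_redirects = [Sitelink("enwiki", "Some alternative name", is_redirect=True)]
mixed = only_redirects + [Sitelink("dewiki", "Some article")]

print(sitelinks_establish_notability(only_redirects))  # False
print(sitelinks_establish_notability(mixed))           # True
```

An item whose only sitelinks are redirects would therefore still need to qualify under criterion 2 or 3.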

Revise criteria #2 wording

Following the suggestions of @Rdrg109 and @BrokenSegue above (thank you!) I'm proposing to reword criteria #2 so that is simpler, more understandable, and easier to read.

Current wording:

It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references.

Proposed wording:

It refers to an instance of a clearly identifiable conceptual or material entity that can be described using serious and publicly available references.

@Emu @LiberatorG @GZWDer Lectrician1 (talk) 17:41, 13 December 2022 (UTC)

sounds good to me. BrokenSegue (talk) 17:54, 13 December 2022 (UTC)
  Strong support--Estopedist1 (talk) 18:32, 13 December 2022 (UTC)
  Support Sure. ArthurPSmith (talk) 19:42, 13 December 2022 (UTC)
  Support even if it is still intentionally vague, it is better than nothing.--GZWDer (talk) 21:02, 13 December 2022 (UTC)
  Support See Special:Diff/1784600409 for my reasoning on supporting this change. -- Rdrg109 (talk) 22:09, 13 December 2022 (UTC)
no objection --Emu (talk) 22:41, 13 December 2022 (UTC)

  Done https://www.wikidata.org/w/index.php?title=Wikidata:Notability&diff=1806697286&oldid=1790995087&diffmode=source Lectrician1 (talk) 16:32, 8 January 2023 (UTC)

Consider the Necessity of Completeness

I have noticed recently that a large number of items I have created have been nominated for deletion. I have found this rather disappointing because it seems that I made an assumption about the purpose of Wikidata that isn't necessarily shared with others.

As an example, I created an item for a single McDonald’s (Q38076) location. This specific location was deemed not notable and was unceremoniously deleted.

This would be perfectly reasonable if we were creating an encyclopedia, but Wikidata is a database. I enjoy contributing to Wikidata because I believe databases have super powers that other works couldn't possibly have. They have the power to answer difficult questions that we couldn't possibly answer without one.

For instance, if we wanted to answer the question:

What counties in the United States of America (Q30) don't have a McDonald’s (Q38076) location?

It would be extremely difficult to answer that question without a database of all of the counties in the United States of America (Q30) and a database of all of the McDonald’s (Q38076) locations in the United States of America (Q30) linked to what counties they reside in. We likewise couldn't create educational content for an encyclopedia. For example, if we wanted to create a graph of all of the McDonald’s (Q38076) that have opened and closed in Texas (Q1439) in the 1990s we wouldn't be able to do that without having that data, in a database, that we could query.

Not only would a large number of non-notable McDonald's locations need to exist in Wikidata, but there would also need to be a near complete dataset of all of the McDonald's locations that have ever existed.
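The dependence on completeness can be shown with a small Python sketch over toy data. All county names and location identifiers here are made up for illustration; nothing is taken from Wikidata itself.

```python
def counties_without_location(counties: set[str], locations: list[dict]) -> list[str]:
    """Return the counties that have no recorded location (a simple anti-join)."""
    covered = {loc["county"] for loc in locations}
    return sorted(counties - covered)


# Hypothetical toy data.
counties = {"County A", "County B", "County C"}
locations = [
    {"id": "loc-1", "county": "County A"},
    {"id": "loc-2", "county": "County A"},
    {"id": "loc-3", "county": "County C"},
]

# With the full dataset, only County B lacks a location.
print(counties_without_location(counties, locations))  # ['County B']

# If the record for loc-3 is deleted, County C is now also reported,
# even though nothing changed in the real world.
remaining = [loc for loc in locations if loc["id"] != "loc-3"]
print(counties_without_location(counties, remaining))  # ['County B', 'County C']
```

The query itself is trivial; the answer is only as good as the coverage of the underlying records.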

I'm using McDonald's as an example here, but I hope you see how this could apply to many different knowable queries. For example:

I'm disappointed that we have, in what appears to be a zealousness to preserve a sense of notability of the content, destroyed the very mechanism that makes Wikidata a valuable knowledgebase. If it can't answer knowable questions about the universe, what exactly is the value of Wikidata?

To be clear, I am not saying that we need to have an item for every atom in the universe. What I am saying is that it might be beneficial to think about the consequences of deleting some of these records, when they hold tremendous value when the database is "complete".

Therefore, I propose that instead of deleting non-notable items, that we instead create a property that indicates that they are non-notable. This can be used to demote the items in search, visually indicate that they are non-notable, and can be used as a filter in a query if the non-notable items are of no interest. I think this would solve the problems the notability policy attempts to solve, while also not diminishing the value of the database in the future. U+1F360 (talk) 06:57, 13 January 2023 (UTC)

There aren’t serious (independent, third-party) sources per WD:N #2 for most of those restaurants. And that’s really all there is to say of the matter, I guess. Wikidata aims to create a secondary database of known facts, not searching the real world for facts that can’t be independently proven. --Emu (talk) 22:21, 13 January 2023 (UTC)
If that is the case, then I can only suggest that Wikidata:Introduction is updated so others are not drawn into contributing to this project (like I was) under the pretense that it could provide additional value beyond its individual items. U+1F360 (talk) 23:08, 13 January 2023 (UTC)
Okay, but what in Wikidata:Introduction should we change? The key points seem to be right here: “Wikidata editors, who decide on the rules of content creation and management” and “A secondary database. Wikidata records not just statements, but also their sources, and connections to other databases. This reflects the diversity of knowledge available and supports the notion of verifiability.” --Emu (talk) 23:45, 13 January 2023 (UTC)
I think it could be useful to state somewhere something like what you said: "Wikidata aims to create a secondary database of known facts, not searching the real world for facts that can’t be independently proven." In other words, it doesn't exist to create a database of (named) things, only things which have been independently verified. Whether they can be verified (by a human) is irrelevant and by whom is of ultimate importance (independent and serious). Therefore it is impossible to use Wikidata to answer any sort of question that relies on aggregation. U+1F360 (talk) 00:02, 14 January 2023 (UTC)
I would argue that Wikidata does not meet the definition of a database, because the collection is meaningless, other than, I suppose, that the collection's items have been independently verified by serious sources. That meaning has effectively no value as an aggregate and therefore isn't a collection of anything meaningful. U+1F360 (talk) 00:27, 14 January 2023 (UTC)
To me Wikidata is so obviously a database that I can't even imagine what you are trying to argue. I do though have sympathy for the idea that an individual McDonald's could be notable enough. I don't think our notability guidelines are particularly well designed (particularly nowhere is it explained what a serious source is). I would support an effort to try to make this more rigorous. It's hard to write a rule that includes individual McDonald's but excludes an unknown SoundCloud rapper. Maybe there's a good reason for that. BrokenSegue (talk) 00:43, 14 January 2023 (UTC)
I mean it's a database in the sense that the collection is "known facts". I just don't understand what the value of the collection is to anyone. What could I possibly do with that as an aggregate?
I suppose what we could do is add to the list of Wikidata:What Wikidata is not, because it's not a collection of anything other than known facts and the bar for what is "known" seems ridiculously high. It can't be known by a bunch of humans in a community, it has to be known by a "serious" and "independent" source. U+1F360 (talk) 02:07, 14 January 2023 (UTC)
You can do lots of things with this information. An example of something I've done is look up the gender of everyone in the past 10 US presidential administrations and compute the gender ratio. There are literally millions of things you could do. Plot films' Metacritic scores against their Rotten Tomatoes scores. Look at what film genres are the most profitable. It literally goes on forever. BrokenSegue (talk) 07:17, 14 January 2023 (UTC)
This assumes that all of the data items in your query are notable themselves, which ironically, if the items are deleted, then this fact is unknowable. How did you know that all (without exception) US Presidential Administrations would be notable? What about films that are on metacritic and rotten tomatoes, but not on Wikidata? Would your plot be accurate? U+1F360 (talk) 15:23, 14 January 2023 (UTC)
I suppose what we are saying is that users of Wikidata are required to intuit the notability of all of the items in their query? I don't understand why we would ask our users to perform such a task which seems impossible to me? U+1F360 (talk) 16:35, 14 January 2023 (UTC)
If you aren't sure whether a US presidential administration is notable then we are having very different conversations. Few things have more serious sources discussing it than a US presidential administration. And Wikidata will never be complete/perfect so your worries about missing entries will always be present no matter the notability policy. BrokenSegue (talk) 17:01, 14 January 2023 (UTC)
My point is that you were able to intuit that the items were notable; for many things it's impossible to intuit that. Could you describe the steps you took to know that all US presidential administrations (which you may not be familiar with) are notable according to Wikidata's policies? U+1F360 (talk) 17:58, 14 January 2023 (UTC)
I cannot explain the thought process. It is so obvious to me and I can't imagine something substantially more notable than a US presidential administration. I mean all of those items also have Wikipedia articles so they by default are notable in Wikidata. I do agree the notability policy is poorly specified but even the current bad wording makes it very clear that a presidential administration is notable. There are obviously serious books written about every presidential administration. That isn't true of every McDonald's location. BrokenSegue (talk) 18:03, 14 January 2023 (UTC)
To be clear, you're requiring the user to intuit this knowledge about whatever subject they happen to be interested in? U+1F360 (talk) 18:19, 14 January 2023 (UTC)
Before making an item on wikidata you should check that it is notable. So before making an item about a presidential administration maybe search a library's catalog for a book about the administration. No intuiting is needed. As I said the main issue is deciding what sources are serious (but widely published books in common libraries are pretty obviously serious). BrokenSegue (talk) 18:30, 14 January 2023 (UTC)
I'm asking about a user who, I assume, makes database queries rather than contributions. It seems like they would need to know whether all of the items that could be used to generate their query result are notable before running their query? U+1F360 (talk) 18:34, 14 January 2023 (UTC)
A user querying wikidata always has to be aware of the fact that the data they are querying might be incomplete/missing for various reasons. One reason it may be incomplete is that the items they are querying are not sufficiently notable. This is an inherent limitation of a database that follows the open-world assumption (Q851949). There is still value in a database that follows the open world model. BrokenSegue (talk) 18:56, 14 January 2023 (UTC)
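One way to read the open-world point in code: a lookup over incomplete data can confirm that something exists, but a missing record should be treated as "unknown" rather than "no". The following is only an illustrative sketch with hypothetical names and toy data.

```python
from typing import Optional


def has_location(county: str, known_locations: list[dict]) -> Optional[bool]:
    """Return True if a location is recorded for the county, otherwise None.

    The function never returns False: under the open-world assumption,
    the absence of a record is not evidence that no location exists.
    """
    if any(loc["county"] == county for loc in known_locations):
        return True
    return None


# Hypothetical toy data.
known_locations = [{"id": "loc-1", "county": "County A"}]

print(has_location("County A", known_locations))  # True
print(has_location("County B", known_locations))  # None (unknown, not "no")
```

A consumer of the data then has to decide explicitly how to handle the unknown cases, which is the limitation described above.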
Can I ask, what references exactly did you include in the deleted McDonald's item? --Tagishsimon (talk) 01:04, 14 January 2023 (UTC)
I noted that it was linked in 6 other databases that we have properties for and the item included an official reference on McDonald's own website. All of that was insufficient in establishing notability.
To be clear, I'm not disappointed that the item was deemed "not notable", I'm disappointed that our mechanism for dealing with non-notable items is deletion. A more appropriate database design would be to flag these items as non-notable (or the reverse if we would prefer). U+1F360 (talk) 01:52, 14 January 2023 (UTC)
In my view the main point of deleting non-notable things is to prevent them from clogging up Wikidata. If there are a million people named "John Smith" in the database it makes reconciliation difficult. Merely flagging them doesn't save us storage/compute unless they were basically inaccessible/unlinkable from the notable entries. And guess what that's basically what deletion does because deleted items are preserved forever and can be undeleted. So it basically is a "flag" as you asked for and the main consequence of the flag is that only admins can see the item. BrokenSegue (talk) 07:20, 14 January 2023 (UTC)
As an expert software engineer, I think this is the wrong solution for this problem. There are better solutions to this problem (which I outlined in the original post) than deletion. It should be somewhat trivial for non-notable items to remain editable, but demote them in discoverability to prevent them from "clogging up" the user interface. U+1F360 (talk) 15:30, 14 January 2023 (UTC)
Well I am also an "expert" software engineer and I can tell you that this isn't a question of software engineering but of policy. And it makes no sense to have items marked as non-notable searchable/linkable. Do you imagine that people will figure out what is notable by searching up non-notable items? Instead of reading a policy on notability? What point is there in a notability policy if we don't delete anything? And even if you are right and there is some technical solution to this problem it's immaterial because there isn't a dev team to implement the features you want to make this happen. BrokenSegue (talk) 17:01, 14 January 2023 (UTC)
Notability is a problem and deletion is a solution. There are other solutions to this problem and there could be multiple solutions to the problem which could be implemented differently depending on the item. My point is mostly that we might want to consider the value of the collection rather than the value of the item. U+1F360 (talk) 18:03, 14 January 2023 (UTC)
@U+1F360: Without commenting on the notability of McDonald's restaurants, I think it would be helpful to clarify two issues; one regarding terminology, the other policy.
The term "notability" in this context means "suitability for inclusion in Wikidata". By definition, something non-notable won't be included; something notable should and may be. The concept of including but tagging an item as "non-notable" is thus contradictory.
For a set of items deemed non-notable, there is nothing stopping you, or any other individual, company, or public body, creating a separate Wikibase instance, and federating queries with Wikidata to achieve the desired results. Indeed, it is intended that an ecosystem of Wikibase instances should exist.
This does not mean that you cannot make an argument that each individual McDonald's restaurant is notable in Wikidata terms. The community at large will decide such cases, and you may argue it more publicly (or post a link to this discussion) at Wikidata:Project chat. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:17, 14 January 2023 (UTC)
If that's the definition, and deletion is the only acceptable technical solution, then I believe Wikidata is missing the forest by focusing on the trees and that is a shame. U+1F360 (talk) 15:42, 14 January 2023 (UTC)
"This does not mean that you cannot make an argument that each individual McDonald's restaurant is notable in Wikidata terms."
Why can't I make the argument that the collection of individual McDonald's restaurants is notable, rather than arguing on behalf of literally each individual item? U+1F360 (talk) 18:47, 14 January 2023 (UTC)
@U+1F360: I think you're missing the main point of the notability policy. As a database that anyone can edit, it is important that we are able to check the contributions to ensure that they are accurate. The notability policy outlines what we are equipped to verify; generally we verify information using publicly available serious sources, typically provided as references and identifiers in the item. If we allowed information that we cannot verify, Wikidata would quickly become overrun with spam and false information. Even if we knew that you are trustworthy and the information you put in your new item was completely accurate, when someone else comes along and "updates" some information in that item we need to be able to verify that the update is accurate, and we are not equipped to do that for items that are not covered by publicly available serious sources.
That said, there are other models that could be used to verify accuracy. For example OpenStreetMap relies on volunteers in each local area to verify data in their area, and this may be a better fit for a database of local information that isn't necessarily covered by publicly available serious sources. IMDb has a different model, tailored to verifying information about films. Wikidata items can link to OpenStreetMap, IMDb, and many other external databases, and OpenStreetMap also links to Wikidata; by linking together several databases that each have their own unique strengths we have access to more than what would be possible under the model of any of the individual databases. –LiberatorG (talk) 17:30, 14 January 2023 (UTC)
I understand that we need a notability policy and to be perfectly clear, I support having a policy. I do not see the value in having an item for the box of cereal I just purchased. In that instance there isn't a value in the item and there is also not a value in the collection. What would be the value of having a database of every single box of cereal ever sold? I completely understand that that isn't valuable.
The point I'm trying to make is that sometimes individual items may not be "notable", but the collection is. Having a collection of movies for instance, seems valuable to me, even if some of the movies are not notable.
Wikidata doesn't trust other databases in establishing notability. The properties you mention IMDb ID (P345) and OpenStreetMap relation ID (P402) are both Wikidata property for an identifier that does not imply notability (Q62589320) so even if a film was found in IMDb or a place in OpenStreetMap, that does not meet our standard of notability. Therefore, you cannot make assumptions about the aggregate of movies on Wikidata. I can't say things like "What was the average ticket sales by film produced in 1980s in France?" because Wikidata can't reliably answer that question since some of the movies were "not notable" and thus the average would be skewed. The user isn't even given the option to exclude films from their query because they just don't exist. U+1F360 (talk) 18:17, 14 January 2023 (UTC)
To be a little more succinct: Our current notability policy focuses on the notability of the item, I think we should also consider the notability of the collection. U+1F360 (talk) 18:29, 14 January 2023 (UTC)
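A minimal sketch of the film-average concern raised above, with entirely made-up numbers. The film names, sales figures and the `has_item` set are hypothetical; the sketch only shows that an average computed over the items that exist can differ sharply from the average over the full collection when the missing items are not a random sample.

```python
from statistics import mean

# Hypothetical ticket sales (in thousands) for every film in some collection.
all_films = {"Film A": 900, "Film B": 700, "Film C": 20, "Film D": 10}

# Hypothetical subset of films that have an item; the low-selling ones are missing.
has_item = {"Film A", "Film B"}

# Average over the complete collection.
true_avg = mean(all_films.values())

# Average a query would return when it can only see films that have items.
observed_avg = mean(sales for title, sales in all_films.items() if title in has_item)

print(f"complete collection: {true_avg}, items present: {observed_avg}")
```

The gap between the two values depends on which films are missing, which a user querying the data cannot determine from the query result alone.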
I'm actually sympathetic to the idea that a collection could be notable while an individual item could not be. This I think is the real rationale for having items for every published paper in wikidata even though many published scholarly works are read 0 times. Try to convert this idea into a rigid policy we could enforce though. BrokenSegue (talk) 18:58, 14 January 2023 (UTC)
That's a great question... how do we evaluate the value of a collection that may not presently exist? U+1F360 (talk) 19:08, 14 January 2023 (UTC)
You can’t. Case in point: I would be very interested in a database of every box of cereal ever bought. Maps could show cereal sales at the county level, we would finally know the last box of cereal that assorted celebrities ate before their untimely death, there would be listicles on social media with fancy titles like
  1. "The All-Time Best-Selling Cereals: A Look at the Top 10 Brands"
  2. "Cereal Flavors Through the Decades: The Most Popular Tastes of Each Era"
  3. "The Rise and Fall of Cereal Brands: A Look at the Winners and Losers"
  4. "Cereal by the Numbers: A Breakdown of Sales by Demographic"
List created by ChatGPT. --Emu (talk) 19:21, 14 January 2023 (UTC)
If we can't figure this out, then Wikidata is on course to only be a way to link Wikimedia wikis together... which seems tragic to me. U+1F360 (talk) 20:00, 14 January 2023 (UTC)
Just throwing this out there as an idea... but I wonder if we could say that if a public database exists (whether open or not), that is an indication of notability of a collection? That would work for movies and McDonald's locations, but it wouldn't work for cereal boxes. U+1F360 (talk) 20:12, 14 January 2023 (UTC)
But isn’t that pretty much our current concept of notability? Some identifiers (and most printed encyclopedias, etc.) confer notability. That’s not too far from a collection of notable concepts. --Emu (talk) 20:24, 14 January 2023 (UTC)
In my example, there is obviously a database of McDonald's locations and the item in question was linked to 6 other/independent databases (OpenStreetMap, Who's on First), some of whose sole purpose is to be a database of eateries or local venues (i.e. Yelp, UberEats). So I would say no, it is not, or our policy is not clear enough. U+1F360 (talk) 20:32, 14 January 2023 (UTC)
This is where we run into vagaries of the current rules. Does this make every restaurant in Yelp notable? Does this also make every soundcloud rapper notable? Everyone with a facebook profile? In some sense as currently written the rules do allow you to make an item for every McDonalds. Personally I think it's unclear. BrokenSegue (talk) 20:56, 14 January 2023 (UTC)
To an extent, this already is the case with our infamous structural needs notability per WD:N #3. --Emu (talk) 19:21, 14 January 2023 (UTC)
  Info Based on the above conversation, I would like to revise my proposal. I think it would be better if we could revise the notability policy to allow for the notion of the notability of a collection rather than the current criteria which focuses exclusively on items. I'm open to feedback on this, but an idea I had (above) is that we could consider public databases as an indication of notability for a collection (i.e. "IMDb is a public database of movies, therefore a collection of movies is notable") or something like that? U+1F360 (talk) 23:23, 14 January 2023 (UTC)
In this way, properties like IMDb ID (P345) would be a Wikidata property for an identifier that suggests notability (Q62589316), since we recognize the notability of the collection. U+1F360 (talk) 23:41, 14 January 2023 (UTC)
But IMDb ID (P345) is a Wikidata property for an identifier that does not imply notability (Q62589320) and rightly so (it’s user-generated content with unclear quality standards and it isn’t even complete in the sense that you want it to be). Other public databases (like Library of Congress authority ID (P244)) on the other hand do suggest notability. So your proposal should explain how this additional collection clause is different from current notability criteria. It might be a good idea to come up with several examples of concepts that a) aren’t notable under the current regime b) with a consensus that they should be notable anyway and c) an explanation how this new provision might remedy the situation. --Emu (talk) 23:47, 14 January 2023 (UTC)
In my example, we have a few properties that could be used to suggest notability that (at least on their own) don't:
I suppose you could say that individually, none of these suggest notability on their own (since, sure any of them could be bad/wrong), but together I think it seems obvious that this is a thing that belongs in the collection (if we so deem that a collection of restaurants is notable)? If the problem is that databases do not suggest notability because "user-generated content with unclear quality standards", it's hard to believe that, taken together that would be the case? How can we write this into a policy so that it makes sense? U+1F360 (talk) 23:59, 14 January 2023 (UTC)
Without going into specifics (OSM is a special case and I’m not familiar with the specifics of most of those identifiers): We have thousands of musicians who have an entry in all sorts of databases, often a dozen of them. Doesn’t make them notable. The sheer multitude of references isn’t of any use if there is no real data curation process. --Emu (talk) 00:13, 15 January 2023 (UTC)
I'm proposing that the musicians are notable, the movies are notable, the places are notable; but your cereal box is not notable. I think the standard of using external databases, whether "user-generated content" or not is a good way to make that distinction, unless you have an idea for a better standard? U+1F360 (talk) 01:33, 15 January 2023 (UTC)
When you think about the scope of items between a "non-notable" musician on Soundcloud and your cereal box, the musician seems extremely notable. U+1F360 (talk) 01:38, 15 January 2023 (UTC)
allowing random soundcloud rappers creates practical problems. there's generally no reliable sources about them. we would also be overwhelmed with spam if we let literally anyone make an entry. it's already hard enough to manage wikidata's scale. BrokenSegue (talk) 05:12, 15 January 2023 (UTC)
I think that means we need to get better at automating more tasks? We could even automate the deletion of non-notable items if that is useful. U+1F360 (talk) 14:39, 15 January 2023 (UTC)
To further clarify, you're pointing out a problem: "we would also be overwhelmed with spam if we let literally anyone make an entry. it's already hard enough to manage wikidata's scale.", but I don't believe a good solution is to use the notability policy to reduce the number of items, since that solution, as I've described, makes Wikidata useless as a database of knowledge. I think a better solution to the problem would be to increase the automation used in performing these tasks. U+1F360 (talk) 14:49, 15 January 2023 (UTC)
Automation is useful and already heavily used but it will never solve all spam issues, at the very least not in the near future. In any case, the notability criteria have by and large worked for a very long time. Why should we abandon them because you want to query a bunch of McDonald’s venues? If you are interested in them, it’s pretty easy to come up with a solution that doesn’t need Wikidata. --Emu (talk) 16:20, 15 January 2023 (UTC)
Again, that is an example, I've provided many others. With the current notability policy, which only considers the notability of items rather than collections, Wikidata is only useful as a structured data repository for Wikimedia wikis, it cannot serve any other purpose as I've demonstrated repeatedly. I guess the question I'm asking you is: What do you want Wikidata to be? I was under the impression that it was more than that, but if it's not that's fine too. I mostly want to make sure we are all on the same page and not pretending that Wikidata is something that it isn't. U+1F360 (talk) 17:03, 15 January 2023 (UTC)
We aren’t. You are reading something into Wikidata that it is not. --Emu (talk) 21:17, 15 January 2023 (UTC)
Well that is fine then I will be off. Good day to you all, it has been a fun time but I really have no desire in helping build a repository of structured data for Wikimedia wikis, which is not the same thing as building a database of notable collections. U+1F360 (talk) 22:17, 15 January 2023 (UTC)
Your complaints about Wikidata apply to every open-world database which is basically all databases of a comparable size. Does Google's Knowledge graph have no use? Or Freebase? Google paid a lot for Freebase. It really seems you don't have a lot of experience with knowledge graphs. BrokenSegue (talk) 21:49, 15 January 2023 (UTC)
Google's Knowledge graph and Freebase find value in the collection, Wikidata does not; that is a fundamental distinction that makes Wikidata of little use outside of Wikimedia wikis. U+1F360 (talk) 22:20, 15 January 2023 (UTC)
Google uses its knowledge graph to answer basic questions for users. E.g. the age of arnold schwarzenegger. The infoboxes Google shows also are drawn from the knowledge graph which draws on familial relationships, awards etc. Why does Google see value in a knowledge graph of that sort and you don't? BrokenSegue (talk) 02:24, 16 January 2023 (UTC)
I think increasing automation is a good idea and we should do it. How about we invest in that now (it will also require some RfCs to approve the use of automation). And revisit this topic in a couple months. I hear you are an expert programmer. BrokenSegue (talk) 18:18, 15 January 2023 (UTC)
I think this would happen naturally if we aren't removing people like myself from contributing to the project because of the notability policy. In other words, the vision of what the project desires to be should precede everything else. Either I have completely misunderstood that vision, in which case I don't have a desire to be a part of it; or the vision is misunderstood by a large number of admins, in which case we need to clarify that vision and align our policies accordingly. U+1F360 (talk) 19:02, 15 January 2023 (UTC)
"the vision of what the project desires to be" is in part guided by the community. I don't think there is even consensus on this amongst admins. Wikidata has multiple objectives in multiple directions. BrokenSegue (talk) 02:26, 16 January 2023 (UTC)
(edit conflict) IMDB is community edited with basically no notability standards. Anyone can get an entry on there. Are you saying that even the least notable actor that made themself an entry on IMDB should have an entry on wikidata? Further what are the criteria for notable collections? The main reason for our current notability standard is that if there aren't serious sources then none of the statements on the items can have good references. If literally all we know is that they are an actor named "John Smith" in what sense can the wikidata item say anything meaningfully about them. There may be something here but I suggest you spend some time reading Wikidata:Requests for deletions candidates for a few days before proposing changes to the notability policy. BrokenSegue (talk) 23:48, 14 January 2023 (UTC)
I think you and @Emu have a good point and maybe we just need the equivalent of Library of Congress authority ID (P244) but for places? or whatever collection we deem notable for that matter? U+1F360 (talk) 00:07, 15 January 2023 (UTC)
  • Wikidata is never fully complete. We can have more or less data but we will never have all data. Adding individual McDonalds locations doesn't allow anyone to run queries for which countries don't have a McDonalds location, because you never know whether the lack of McDonalds locations for a country is due to no Wikidata editor creating an item or because there's actually no location in the country. To run a query like that we would need a bot that imports all McDonalds locations, for example using a list provided by McDonalds itself, and that updates the entries to keep them up to date as McDonalds closes locations and adds new ones. You creating a bunch of items for individual locations would make the job of someone running such a bot harder because they have to do the additional task of merging all your individual locations into the ones imported by the bot.
By design Wikidata has a bot approval process. If you want to do something like import all McDonalds locations, that's the venue to discuss it. ChristianKl 15:18, 17 January 2023 (UTC)
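A minimal sketch of the merging burden described above, assuming a bot import keyed on a store identifier. The `store_id` field, the record layout and the placeholder item IDs are hypothetical and do not correspond to any real Wikidata property or bot; the point is only that hand-created items without the shared key cannot be matched automatically and are left for manual merging.

```python
def reconcile(existing_items, imported_records):
    """Split imported records into updates to existing items and new items to create.

    existing_items: list of dicts with 'qid' and an optional 'store_id' (hypothetical key).
    imported_records: list of dicts with 'store_id', as supplied by the import source.
    """
    # Index only the existing items that carry the shared identifier.
    by_store_id = {
        item["store_id"]: item["qid"]
        for item in existing_items
        if item.get("store_id")
    }
    to_update, to_create = [], []
    for record in imported_records:
        qid = by_store_id.get(record["store_id"])
        if qid is not None:
            to_update.append((qid, record))
        else:
            to_create.append(record)
    # Existing items without the identifier cannot be matched and need manual review.
    unmatched = [item["qid"] for item in existing_items if not item.get("store_id")]
    return to_update, to_create, unmatched


# Toy usage with placeholder IDs.
existing = [
    {"qid": "Q-placeholder-1", "store_id": "S1"},
    {"qid": "Q-placeholder-2"},  # created by hand, no shared identifier
]
imported = [{"store_id": "S1"}, {"store_id": "S2"}]
to_update, to_create, unmatched = reconcile(existing, imported)
```

In this toy run one record updates an existing item, one becomes a new item, and the hand-created item without an identifier is flagged for manual merging.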

Notability of bachelor's thesis and master's thesis?

I am unable to find discussion about the notability of bachelor's theses and master's theses. At first glance, I would guess that these are not notable, but we have a massive number of items for them. If a thesis has a DOI, it is suitable for Wikidata. If we really keep these theses (without a DOI) it means that their authors meet WD:Notability ('structural need'), but wouldn't that be superclutter for Wikidata? Estopedist1 (talk) 09:21, 17 January 2023 (UTC)

I'm not sure even a DOI should count as reason to create an item for these. Do you have rough numbers on how many of these we have now? ArthurPSmith (talk) 13:22, 17 January 2023 (UTC)
Worth including if they have been cited? Also, perhaps, if the author has become a noted public figure? Jheald (talk) 13:47, 17 January 2023 (UTC)
@ArthurPSmith@Jheald:
  1. as far as I know, the DOI identifier makes the item pass WD:Notability.
  2. backlinks for bachelor's thesis (Q798134) are 395 + redirects
  3. backlinks for master's thesis (Q1907875) are 45,000 + redirects
  4. not sure about citing
  5. if a person meets WD:Notability then maybe his/her master's thesis is worth mentioning, but sometimes there may be co-authors, and these would already be infoclutter if standalone items were made
Estopedist1 (talk) 07:09, 18 January 2023 (UTC)
Some universities like the University of Vienna (Q165980) issue DOIs for all theses that are available online (bachelors theses aren’t eligible for the repository). I’m not sure if I like the idea of mass imports of all those theses including authors … --Emu (talk) 16:49, 18 January 2023 (UTC)
I would oppose including bachelor/master theses en masse, as well as doctoral thesis (sciences) (Q100328456) in a parallel system. We don't even have all PhD theses (Q187685?) or Doktor nauk dissertation (Q100328465). The latter I consider the most notable of all these classes. --Infovarius (talk) 03:34, 19 January 2023 (UTC)
I would support including bachelor/master theses en masse. If it has a DOI-code, it is able to be identified and in my view qualifies under WD:N #2 Jack4576 (talk) 12:07, 3 September 2023 (UTC)
Concerns regarding 'Superclutter' are not an argument I find particularly compelling. The site is already full of an unwieldy amount of information; which is a feature of Wikidata, not a bug. Jack4576 (talk) 12:08, 3 September 2023 (UTC)

Collecting discussions from Special pages

Very likely there are discussions about Special pages as Wikidata items. Important links for future discussions:


  Oppose Those pages are populated by the software itself, not by our edits; such items may only be used by LTAs and their sock(meat)puppets for their funnies. If the only reason for their creation is interwiki links, then as per Midleading, we should consider installing the Cognate extension on all wikis, and enable it for all Special pages. I just wonder why those creations are not barred by our AbuseFilters? Liuxinyu970226 (talk) 02:50, 24 January 2023 (UTC)
@Estopedist1 nothing more to reply? Liuxinyu970226 (talk) 07:41, 28 January 2023 (UTC)
@Liuxinyu970226: no, nothing more. I opened this discussion because I am quite sure that in future we have problems with Special pages. I just collected up information/discussions about special pages in Wikidata Estopedist1 (talk) 07:54, 28 January 2023 (UTC)

In order to monitor sitelinks for Special pages, there is now a report that my bot will update on a weekly basis: Wikidata:Database reports/Special pages as sitelinks. It does include *all* sitelinks of this type in all of Wikidata. —MisterSynergy (talk) 21:54, 6 July 2023 (UTC)
