Wikidata:Property proposal/Baidu Baike ID (2)

Baidu Baike ID

   Under discussion
Description: Entry in Baidu Baike encyclopedia
Represents: Baidu Baike (Q803722)
Data type: External identifier
Allowed values: string
External links: Use in sister projects: [ar][de][en][es][fr][he][it][ja][ko][nl][pl][pt][ru][sv][vi][zh][commons][species][wd].
Number of IDs in source: 22,991,669
Expected completeness: always incomplete (Q21873886)
Formatter URL: $1
See also
  • Wikidata:Property proposal/Baidu Baike ID (2018)
  • Wikidata:Property_proposal/Archive/34#Baidu_Baike_entry (2015)
  • Namuwiki ID (P8885) (November 2020; Korean wiki)
Motivation

    Baidu Baike is the de facto version of Wikipedia in Chinese: it has 16x as many entries as Chinese Wikipedia. It is also heavily censored and has myriad neutrality, copyright, and other issues, which have led previous proposals to add a property for it to fail. However, I do not think adding a property for an entity does or should carry any sort of implied endorsement, and like it or not, Baidu Baike is probably the best source of encyclopedic information on non-controversial Chinese topics that might be too obscure to have much or any coverage on Wikipedia. Documenting Baidu Baike IDs could presumably have applications for censorship researchers or for identifying areas where Wikipedia's coverage of Chinese topics could be expanded. Sdkb (talk) 19:05, 8 February 2021 (UTC)


    Pinging village pump discussion participants: @BrokenSegue, Lectrician1, GZWDer, ChristianKl:. Sdkb (talk) 19:14, 8 February 2021 (UTC)

    •   Neutral. Repeating what I said in Wikidata:Project_chat/Archive/2020/02#Is_Wikidata's_purpose_to_provide_links_to_every_(open)_wiki?:
      1. Content is not free - more than 90% of Wikidata properties link to non-free content
      2. Advertisement and paid editing - most open wikis do not care about it. Wikimedia seems to be a very special case that treats paid editing seriously
      3. Copyvio - Also, Wikimedia treats copyvio much more strictly than other open wikis
      4. Misleading material - Also, Wikimedia treats copyvio much more strictly

    --GZWDer (talk) 19:21, 8 February 2021 (UTC)

    • Wasn't there also a problem with content from (zh) Wikipedia itself? If yes, maybe WMF doesn't want us to include them. --- Jura 20:21, 8 February 2021 (UTC)
    •   Support. There is value in these links even if the site is "bad". I wouldn't support linking to something like Conservapedia, but if this is the primary Chinese-language encyclopedia then we really should link to it. We do not restrict ourselves to linking only to "free" sources elsewhere (e.g. IMDb), so I don't buy that argument (this also applies to paid editing). BrokenSegue (talk) 20:36, 8 February 2021 (UTC)
    •   Oppose (changed from Support). Same great reasoning as User:BrokenSegue, but I recognize the copyright issues User:Sanmosa brings up. I don't think we should ever platform a service that goes against something we defend. If Baidu Baike ever expanded to the West and added translations of their articles, this would be completely unacceptable given their copyright violations. --Lectrician1 (talk) 21:40, 8 February 2021 (UTC)
      @Lectrician1: This really goes to the heart of a central question about Wikidata's identity: Are we trying to be a comprehensive, neutral knowledge base of the world's information, or are we just an arm of the WMF that supports Wikipedia and restricts information that doesn't align with Wikipedia's values? I think we would err badly to adopt the oppose view, since our credibility comes precisely from the fact that we are neutral, and requiring entities to share our values before documenting them would create the impression that we are endorsing every entity we document and open up a can of worms about what qualifies. We don't want to go down that path. Sdkb (talk) 04:05, 17 February 2021 (UTC)
      But we are going down that path. I don't know many other Wikimedia projects that face the "platforming" issue we have right here. Projects expect sources that are reliable, but being reliable means that the source isn't political. Here we have to be political. I believe that the platforming of false knowledge is wrong. Baidu Baike might not be false knowledge like Conservapedia, but if you're not willing to respect other members of society's rights to intellectual property through copyright, then you're technically suppressing the future potential of shared knowledge (because nobody will want to share anymore). That's a direct conflict with our movement.
      I also think that there exists no "neutral". Everything and everyone must be political with bias in order to remain logically consistent. Here, we need to remain consistent with our values and consider the direct factors of "platforming" this entity. It might seem that we're increasing shared knowledge, when in reality we are not. I also think my "if Baidu came to the West" example situation is very valid too.
      --Lectrician1 (talk) 12:14, 17 February 2021 (UTC)
    • @ShadessKB, Gotitbro, Eroux108: @Shizhao, YFdyh000, Zhuyifei1999, Stang, Junjie Yuan: @白布飘扬, 虹易, Sanmosa, Blissghost: @Shangkuanlc, Artoria2e5, WQL, 1233, Brror: it would be good to have some input from the previous participants as the open question remained unanswered. --- Jura 19:41, 16 February 2021 (UTC)
    •   Weak support If Wikidata is a neutral data source rather than a free web resource advocate, it is reasonable to provide links of considerable quality.--YFdyh000 (talk) 19:50, 16 February 2021 (UTC)
    •   Comment Frequent poor disambiguation and duplicate articles. Might cause data quality issues. --Artoria2e5 (talk) 23:41, 16 February 2021 (UTC)
      • Data quality issues are common problems for most user-generated sites, not only this one.--GZWDer (talk) 16:31, 20 February 2021 (UTC)
    •   Oppose. Adding Baidu Baike links? Never, never, never. I share the opinions of GZWDer and Artoria2e5; besides, Baidu Baike is mostly just a copycat of Chinese Wikipedia with serious copyright violations. Sæn 23:47, 16 February 2021 (UTC)
      I see that you are active on Chinese Wikipedia. There are certainly issues with Baidu Baike copying, but I fail to see how it could be "mostly a copycat" if it has 16x as many entries as the thing it's allegedly copied from. Sdkb (talk) 04:05, 17 February 2021 (UTC)
      Baidu Baike contains many articles copied from websites other than Wikipedia and other encyclopedias. Yes, it seems to be a common problem for user-generated sites.--GZWDer (talk) 16:32, 20 February 2021 (UTC)
    •   Oppose. Baidu Baike has a terrible copyvio problem, the same as last time. I don't think it's a good idea to add a Baidu Baike property.--LaMagiaaa (talk) 04:41, 17 February 2021 (UTC)
      @LaMagiaaa: Could you clarify what relevance you think their having a copyvio problem has to whether or not there should be a property documenting them? As Shangkuanlc pointed out last time, mapping the issues will likely make them easier to resolve. Sdkb (talk) 05:15, 17 February 2021 (UTC)
      Chinese Wikipedia has compiled an incomplete list. Some content may come entirely from Wikipedia without the copyright license being stated correctly. And I don't think a website that infringes copyright should be added. Of course, they may have some original articles, but the average editor cannot tell them apart. So I don't think it is necessary to describe specific articles. The average editor may not be able to tell the difference if a Baidu Baike property is added.--LaMagiaaa (talk) 05:29, 17 February 2021 (UTC)
      "terrible copyvio problem" - that is not only the case for Baidu Baike. It really seems most user-generated sites on the Internet have very little control over copyright violations.--GZWDer (talk) 16:28, 20 February 2021 (UTC)
    •   Oppose: Baidu Baike not only has copyright problems, but is also a home for poor and low-quality articles. Many articles contain false information, so it fails the "probably the best source of encyclopedic information" claim in "#Motivation" above. Plus, if the copyright problem were not serious, there wouldn't be a page emphasizing it. Sun8908 💬 07:12, 17 February 2021 (UTC)
      Wikidata does not require linked pages to be reliable. Fandom (Q17459) also abounds in poor and low-quality articles.--GZWDer (talk) 16:28, 20 February 2021 (UTC)
      @GZWDer: Under this reasoning, we should also create a property for Uncyclopedia, shouldn't we? --Liuxinyu970226 (talk) 11:42, 13 March 2021 (UTC)
      I do not actually endorse creating this property; I am just correcting others' misconceptions. For Uncyclopedia, another issue is that most of its entries are not about real topics, due to its satirical and parodic nature (just like Encyclopedia Dramatica).--GZWDer (talk) 05:07, 14 March 2021 (UTC)
    •   Neutral: I don't think copyright problems can be a barrier stopping Baidu Baike getting its property on Wikidata. But as Artoria2e5 said above, poor disambiguation and duplicate articles of Baidu Baike may create chaos on Wikidata. --Steven Sun (talk) 13:01, 17 February 2021 (UTC)
    •   Oppose, per Sun8908--Shizhao (talk) 02:41, 18 February 2021 (UTC)
      • Please note that the content of Wikipedia is duplicated in countless places, even counting only those without attribution.--GZWDer (talk) 16:29, 20 February 2021 (UTC)
    •   Support: Previously I did not recommend this because of the quality of the resources. But I changed my mind after looking at the Facebook ID, YouTube ID, and Twitter ID properties, whose targets may also be full of "good" or "bad" elements such as advertisements or unreliable information. In any case, Wikidata is just a place to collect those IDs and provide links, no matter what content is on those websites, so I feel OK about letting Baidu Baike ID be here. It doesn't mean the content of Baidu Baike will flood Chinese Wikipedia without restriction. --白布飘扬 (talk) 18:29, 23 February 2021 (UTC)
      @白布飘扬: FB, T... Well, if this user were proposing the " account" property I might support it, but what are the reasons we should have this one? --Liuxinyu970226 (talk) 13:22, 12 March 2021 (UTC)
      Just like other non-Wikimedia wikis, e.g. Fandom article ID (P6262), Chinese Moegirlpedia ID (P5737), Namuwiki ID (P8885), Familypedia person ID (P4193), those links are naturally neutral. Would it harm our basic policy? It depends on how Wikipedians use that information. --白布飘扬 (talk) 16:56, 12 March 2021 (UTC)
    •   Oppose unless and until the concerns from w:zh:WP:BD are resolved. Copyright isn't a big problem (otherwise we couldn't even have a property for zh Moegirlpedia, Chinese Moegirlpedia ID (P5737)); credibility, however, is a big problem that holds me back from a possible support. Also consulting zhwiki administrators who have neither commented nor been pinged: @AT, Acepatrick, Alberth2, Alexander Misel, Alexsh:@Alltonight, Antigng, Aoke1989, Aotfs2013, Bluedeck:@BrockF5, Cdip150, Ch.Andrew, Chiefwei, Cp111:@DreamLiner, Father vice, Ffaarr, Gakmo, Gzdavidwong:@Hamish, Hat600, Htchien, Iokseng, Jasonzhuocn:@Jimmy Xu, Jusjih, KOKUYO, Kalicine730, Kallgan:@Kegns, Kevinhksouth, KirkLU, Koika, Kolyma:@Kuailong, Kuon.Haku, Lakokat, Lanwi1, Liangent:@Manchiu, Minghong, Mongol, Munford, Mys 721tx:@Nbfreeh, Nlu, Outlookxp, Pedist, SElephant:@Shinjiman, Subscriptshoe9, Techyan, Tigerzeng, Wcam:@WhitePhosphorus, Wing, Wong128hk, Ws227:@Xiplus, Yhz1221, Zy26, 乌拉跨氪, 唐戈:@妙詩人, 春卷柯南, 武藏, 淺藍雪, 滥用过滤器:@燃玉, 瑞丽江的河水, 蟲蟲飛, 霧島聖:. --Liuxinyu970226 (talk) 12:51, 12 March 2021 (UTC)
      • Please note that not every site Wikidata links to is a reliable source. Fandom is not reliable either.--GZWDer (talk) 05:10, 14 March 2021 (UTC)
    •   Support To be clear, I do not think highly of Baidu and I fully acknowledge all the problems that have been pointed out. I also think that linking to Baidu in order to redirect users to Baidu is not a good prospect... That said, there are datasets out there that are reliable, neutral, open, respectful of copyright, etc. and that have links to Baidu. In that case the idea would be to use the Baidu identifiers as a pivot for mapping to Wikidata (or a connected dataset). I support this feature for this use case. Eroux108 (talk) 08:13, 17 March 2021 (UTC)
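    •   Oppose. Although Baidu claims that each of its channels has its own corresponding copyright policy, it is unwilling to change the "©2021 Baidu" notice at the bottom of every page. As a result, articles and multimedia content of all kinds that are protected by others' copyright, as well as free content that was originally in the public domain online or published under some-rights-reserved models [for example, licenses such as Creative Commons or the GNU Free Documentation License (GFDL)], become Baidu's private property, or run the risk of being interpreted as Baidu's private property. -- 12:03, 19 March 2021 (UTC)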
    •   Oppose Most pages of Baidu Baike are poorly maintained; even the page on their leader has only 8 references. SkEy (talk) 17:21, 19 March 2021 (UTC)