Wikidata:Property proposal/Baidu Baike ID

Baidu Baike ID

Originally proposed at Wikidata:Property proposal/Authority control

   Not done
Description: Entry in Baidu Baike encyclopedia
Represents: Baidu Baike (Q803722)
Data type: External identifier
Allowed values: string
Number of IDs in source: 15,379,979
Expected completeness: always incomplete (Q21873886)
Formatter URL: $1
See also
  • Wikidata:Property_proposal/Archive/34#Baidu_Baike_entry
Motivation

    To me, Baike seems like the "default" site to go to for Chinese wiki-like things. It's ~15x the size of the Chinese Wikipedia, and there are, for example, a lot of movies and TV shows that never make it to IMDb but do have Baike entries, as do singers, actors, etc. It seems silly not to have it, as it could be used on a lot of items on Wikidata. Wikidata also wants to be multilingual; adding major language-specific sources for data points wouldn't hurt.

    I'm basically just learning to use Wikidata, mostly editing entries related to one entity so far and seeing how things work.
    One of the first things I tried to add was a Baike ID and the lack of it was glaring. ShadessKB (talk) 18:09, 22 May 2018 (UTC)


    To call Baike a mirror of Wikipedia is inaccurate, as it has huge amounts of content not on any Wikipedia. I also don't see how an undefined fraction of translated/copied content would be a reason not to add this. Twitter, Facebook, YouTube, etc. are full of content copied without permission, and they have IDs on Wikidata. If Wikimedia or whoever wants to sue Baidu over something, that would be a separate issue unrelated to adding a property here. I see a huge source that currently cannot be utilized while much lesser ones have been added. This should be a slam dunk.
    Thanks for fixing the external-id bit. ShadessKB (talk) 23:39, 22 May 2018 (UTC)
      Support 15 million IDs is way more than any language Wikipedia has. I checked the first linked example; it certainly looks quite different from enwiki. If there are copyright violations, having the IDs linked would make it easier to find them programmatically, so I think this is a good idea whether or not that's an issue. ArthurPSmith (talk) 19:30, 23 May 2018 (UTC)
      Oppose I did not know the editing policies of the website. As @虹易: states, it looks like advertising and paid editing are rampant on the site. Changing my vote to not include this. I would also like to ask: what is the equivalent of Wikipedia in China, if not Baidu Baike? Gotitbro (talk) 09:41, 29 May 2018 (UTC)
      … and Baidu Baike are the major ones, I believe; I am not Chinese, though. ShadessKB (talk) 12:04, 29 May 2018 (UTC)
    •   Support --Eroux108 (talk) 10:15, 27 May 2018 (UTC)
    •   Support Although Baidu Baike does have a certain bad reputation, I have to say that it is a good identifier to add to Wikidata so that a major related source can be added to items. One thing to add is that for articles related to Chinese culture and history, Baidu Baike has significantly higher quality than Chinese Wikipedia. Also, Baidu Baike theoretically requires editors to provide references and citations in articles (though this is not often practiced), so editors can also find more sources and references there. --Efly (talk) 00:33, 29 May 2018 (UTC)
    •   Neutral. I just know the bad reputation of Baidu.--Jusjih (talk) 01:47, 29 May 2018 (UTC)
    •   Oppose per Pigsonthewing. No open content, lots of copyvio--Shizhao (talk) 02:20, 29 May 2018 (UTC)
    •   Neutral. I don't mind adding it for collection, but I care about how to make and use it. – The preceding unsigned comment was added by YFdyh000 (talk • contribs) at 02:43, 29 May 2018 (UTC).
    •   Oppose per Pigsonthewing. We do not support copyvios here. --Zhuyifei1999 (talk) 03:16, 29 May 2018 (UTC)
    •   Strong oppose to this proposal; we should not link to a site that has such a large amount of copyvio content. --Stang 03:28, 29 May 2018 (UTC)
    I just want to reiterate that Wikidata links to YouTube, Twitter, etc., which are all full of copyvio content. ShadessKB (talk) 11:54, 29 May 2018 (UTC)
    •   Oppose, I have the same opinion as Stang. --Junjie Yuan (talk) 03:45, 29 May 2018 (UTC)
    •   Oppose, besides the copyvio, the site also contains a lot of advertisement articles. It's not a good reference source for a Wikiproject. --白布飘扬 (talk) 05:44, 29 May 2018 (UTC)
    •   Oppose. Besides the issues mentioned above, it is full of *misleading materials* in almost every field. And I would guess fewer than 5 percent of the "articles" there include one or more sources or citations. As is widely acknowledged, although it claims that it never charges for editing, a lot of commerce-related articles are treated specially. Last but not least, the examples the proposer gives (i.e. movies/TV and singers) are a bit ironic. Most such articles are created by the companies related to them. Those companies usually create these articles for advertisement instead of knowledge, and hence these articles often have little, if any, neutrality. 虹易 (talk) 08:45, 29 May 2018 (UTC)
    •   Oppose is the only suitable vote. Baidu Baike is not a member of the Wikimedia Foundation; it is an external link for any wiki under the Wikimedia Foundation, so it can never be accepted. ŚÆŊŠĀ 08:57, 29 May 2018 (UTC)
    •   Oppose. The reputation and political position of this site can't be accepted. It is also, sometimes, of low quality. --Blissghost (talk) 09:10, 29 May 2018 (UTC)
    •   Oppose. These IDs aren't stable (they're like Wikipedia page names) and there are ongoing copyright violation problems with Baidu plagiarizing Wikipedia. Deryck Chan (talk) 10:09, 29 May 2018 (UTC)
    •   Support. Copyvio issues could be resolved more easily if we can map the issue. Also, given its impact in the Chinese-speaking world, I think it would be better to include it, to provide more perspectives on the cultural context of the region. --Shangkuanlc (talk) 10:24, 29 May 2018 (UTC)
    I hope whoever makes the decision in the end takes into account the quality of the arguments behind some of the opposing votes here.
    -there is no requirement that a site/service be a member of the Wikimedia Foundation for external IDs.
    -the political positions of a site shouldn't matter much either.
    -as for IDs being like Wikipedia page names (meaning they could change): a lot of external IDs can change (Twitter, Facebook, etc.).
    There are also entries on Wikidata already that link to Baike through reference URLs, which is just a less structured way of linking to a site that would allow for a more structured one. If (potential) copyvio content is a problem, then aren't references just as big a problem? Some of these arguments against, at least to me, don't seem to make a lot of sense. Baidu Baike has structured data available, and Wikidata is all about that. Adding a link to Baike entries isn't adding, endorsing or accepting copyvio content any more than linking to YouTube, Twitter, Facebook or other sites Wikidata already links to. ShadessKB (talk) 11:52, 29 May 2018 (UTC)
    •   Neutral Baidu Baike has a numeric ID that is probably more stable than the textual one; and let's face it, you don't really expect anything on Baidu Baike, especially article names, to change a lot. Pages are more likely to be deleted than renamed. Using the numeric ID also circumvents the problem of having to deal with disambiguation, as Baidu implements it as multiple page IDs under the same name or something like that. Anyway, I object to the current "string" assignment, but remain open to this whole property thing. I know Baidu Baike is shitty, but well, yeah, it does sort of provide a cross-referenced ID and is used a lot by folks who can't visit zh.wp… If we have a property for some other online encyclopedia, then perhaps -- perhaps we should consider Baidu. --Artoria2e5 (talk) 11:59, 29 May 2018 (UTC)
    Yeah, they do have numeric IDs also. I just couldn't figure out a working formatter URL to link to them. If there is a way, then that should be the formatter URL. As for "Allowed values", if "string" is incorrect, I hope someone can fix it. ShadessKB (talk) 12:09, 29 May 2018 (UTC)
    Since Baidu Baike has abandoned the numeric ID, is the numeric ID really stable for us? It may only serve traversal purposes for us, and traversal by robots may not be appropriate. Its traditional URL looks like, now redirected to /item/articlename/subid. Baidu Baike only provides an API to approved companies, although an unencrypted appid is known on the web. --YFdyh000 (talk) 17:41, 29 May 2018 (UTC)
    • (edit conflict)   Neutral Baidu Baike has a numeric ID that is probably more stable than the textual one. The problem cannot be avoided, however, that there may be multiple entries which contain duplicate content, since there is nothing like English Wikipedia's CSD A10 in Baidu Baike. So, a Wikidata item might have to match many Baidu Baike entries. COI: I was an intermediate Baidu Baike editor before I joined Wikimedia, so I am really familiar with this situation. --WQL (talk) 14:13, 29 May 2018 (UTC)
    •   Strong oppose: Can someone tell me how this got here? We are still dealing with copyvio from zhwp on Baidu Baike. --1233 (talk) 14:12, 29 May 2018 (UTC)
    •   Oppose but not because of the copyvio problem; my concern is how their users trust Wikimedia projects, and FWIW they always do things that violate our trademark policy. --Liuxinyu970226 (talk) 05:51, 31 May 2018 (UTC)
    •   Oppose I have the same opinion as Sanmosa. In addition, it is easy to find the corresponding entry in Baidu Baike using the same pagename. --Brror (talk) 09:36, 31 May 2018 (UTC)
    I think I can understand why there is so much opposition. Without a doubt, it is a big database with its own created content; however, from my point of view, Baidu Baike is not friendly to (or is even vicious toward) the Chinese community. Compared to similar cases such as Facebook, YouTube, or other big content carriers, the difference here is that we as a Wikimedia movement cannot find anyone on Baidu Baike's side with whom to discuss the copyvio and other important issues on the table. That contradicts the community culture and makes a lot of long-term Wikipedians oppose this proposal. --Shangkuanlc (talk) 01:06, 1 June 2018 (UTC)
    •   Support Recording a Baidu ID does not mean displaying it on Wikimedia projects or supporting them. Wikidata's purpose is to record the connections between things. I think the connection between Baidu and the wikis is an objective reality. Why can't we do it? Just because of the opinions? 乌拉跨氪 (talk) 18:51, 6 June 2018 (UTC)
    •   Oppose Baidu Baike has always had a bad reputation and many problems; for example, there are many similar articles that lack improvement. Most importantly, Baidu Baike plagiarizes or copies other encyclopedias. --Zest (talk) 13:30, 23 June 2018 (UTC)
    •   Support This may help Wikidata to cover most of the items in China too. *angys* (talk) 05:03, 10 May 2019 (UTC)
    •   Support Linking it does not mean endorsing the project. Mapping Baidu Baike IDs can be useful for many projects. --MarioGom (talk) 12:29, 6 March 2020 (UTC)
      @MarioGom: This is an old proposal that was already rejected 2 years ago; why bump it here? Anyway, can't you find resources other than the Baidu Republic of Copyvioing site? --Liuxinyu970226 (talk) 14:33, 24 May 2020 (UTC)
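
For reference, the /item/articlename/subid URL shape mentioned in the discussion above can be illustrated with a short sketch. This is only an illustration under assumptions, not Baidu Baike's documented scheme: the host `baike.example`, the `BaikeRef` type and the `parse_item_path` helper are all hypothetical names, and the sketch assumes the sub-ID (when present) is a purely numeric third path segment.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlsplit


class BaikeRef(NamedTuple):
    """Hypothetical container for the two parts of an /item/ path."""
    title: str             # textual article name (percent-decoded)
    sub_id: Optional[str]  # numeric sub-ID, if the URL carries one


def parse_item_path(url: str) -> Optional[BaikeRef]:
    """Split a URL of the assumed form /item/<articlename>/<subid>.

    Returns None when the path does not start with /item/<name>.
    """
    # Split on "/" first, then percent-decode each segment, so an
    # encoded slash inside a title cannot create a spurious segment.
    parts = [unquote(p) for p in urlsplit(url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "item":
        return None
    # Assumption: the sub-ID is all digits; anything else is ignored.
    sub_id = parts[2] if len(parts) > 2 and parts[2].isdigit() else None
    return BaikeRef(title=parts[1], sub_id=sub_id)


if __name__ == "__main__":
    print(parse_item_path("https://baike.example/item/%E5%8C%97%E4%BA%AC/128981"))
```

Keeping the title and the sub-ID as separate fields makes the trade-off raised in the thread visible: the title alone is ambiguous when several entries share a name, while the sub-ID alone says nothing about whether it will stay stable.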