Wikidata talk:Wikidata Lexeme Forms/Chinese

@Deryck Chan, Lucas Werkmeister: hello, I have run into Wikidata:Wikidata Lexeme Forms/Chinese again, which is a de facto dead end for "Chinese".
Is there an official place where this Chinese issue has been or is being discussed?
As a former Chinese-language PhD, I am sitting on a personal Chinese-French dictionary of 3,000 items (likely 5,000 if subdivided by POS as well), with no clear path on where and how to import and pool it into Wikidata. Those entries have hans, hant, toned pinyin, POS, and French translations. Yug (talk) 10:13, 16 December 2021 (UTC)

I considered creating Module:Lexeme-zh based on Module:Lexeme-en and its user script, but I need a "Chinese lexemes" consensus before rolling out my items. I need a green light.
On the module+script matter, the central discussion page is Module talk:Lexeme-en.
AFAIK, there are only about 340 Chinese lexemes on Wikidata. Also, a dataset of 5,000 lexemes could help to create a de facto new norm and consensus for Chinese lexemes and drive progress on that language. Yug (talk) 15:21, 16 December 2021 (UTC)
@Yug: Wikidata talk:Lexicographical data might be a better place for this discussion. My point in writing Wikidata:Wikidata Lexeme Forms/Chinese was more about the usefulness of the Wikidata Lexeme Forms tool in particular for Chinese lexemes, not so much about Chinese lexemes in general. If you arrive at a consensus that Chinese lexemes should indeed have more than one form, then I’d be happy to consider adding support to this tool, but last time it seemed like that might not be the case, and that Wikidata Lexeme Forms wasn’t really the right tool for the job. Lucas Werkmeister (talk) 17:43, 18 December 2021 (UTC)
I agree with your "a dataset of 5,000 lexemes could help to create a de facto new norm and consensus". Just do it...! Deryck Chan (talk) 23:59, 18 December 2021 (UTC)
See also Wikidata:Requests_for_permissions/Bot/EnvlhBot_1#EnvlhBot_1 Yug (talk) 11:35, 15 January 2022 (UTC)
See also Wikidata:Lexicographical data/Documentation/Languages/cmn. I will work on a base guideline in the coming months. @Deryck Chan, Lucas Werkmeister:.
See also Help:Tabernacle, TABernacle (Q26882268): https://tabernacle.toolforge.org/#/ Yug (talk) 13:19, 4 February 2022 (UTC)
@Yug: Did you mean to link to a different EnvlhBot approval discussion? That one is for Breton. Deryck Chan (talk) 15:48, 9 February 2022 (UTC)
@Deryck Chan: Yes. The Breton link is just a thing I monitor, because my case is similar.
Breton had 280 existing entries and ~4,000 entries on a personal computer to upload to Wikidata.
Mandarin Chinese has ~280 existing entries and ~4,000 Chinese-French entries on my personal computer to upload to Wikidata.
Breton and Mandarin Chinese are twin cases, with Breton a bit ahead on documentation, bot and progress. Yug (talk) 14:04, 10 February 2022 (UTC)
For Mandarin Chinese there are also other considerations to take into account before moving forward. The UNIHAN database has, I believe, about 40,000 Mandarin Chinese-English translations under a "nearly open" license. Should I push my data first, or should we rather lobby UNICODE to truly free its data? Yug (talk) 13:32, 11 February 2022 (UTC)