Welcome to Wikidata, نعم البدل!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, don't hesitate to ask on Project chat. If you want to try out editing, you can use the sandbox. Once again, welcome, and I hope you quickly feel comfortable here and become an active editor for Wikidata.

Best regards! Neo-Jay (talk) 17:29, 15 July 2022 (UTC)

Removing certain script representations from lexemes

Please do not remove Arabic-script representations from Hindustani lexemes, even if the lexemes they apply to are clearly of a direct Sanskrit origin. In most cases, the components of these words (if not the words themselves) have attestable Arabic-script representations owing to the sources provided on them; if this turns out not to be the case for a particular word (component), then just inform us and we can try to source one for you. Mahir256 (talk) 00:11, 1 February 2023 (UTC)

@Mahir256: Hi, I hope you're well! I have no issue with the lexemes being of Sanskrit origin. I was removing them because they were quite clearly just direct transliterations of the Hindi lemmas (and, for that matter, with spelling mistakes in some), which I see has been reverted by the original user for alleged "vandalism". If the Urdu lexemes can be properly attested, then as I say, I have no issues with them being reinstated. However, mere transliterations will not suffice. نعم البدل (talk) 01:14, 1 February 2023 (UTC)
With respect to कुष्ठ, for example, the lemma given in Platts is in fact کشٹھ (with a ٿ), so it was appropriate to keep that as the ur lemma there. Similarly the lexemes for शोध and कर्ता are attestable in their Arabic-script representations (making them useful as a source for the components of शोधकर्ता), as is उपसर्ग (also given in Platts in the Arabic script). I can understand why the other user might consider your edits vandalism; a blanket statement 'mere transliterations will not suffice' seems like shaky ground for such removals as you have made in the face of sources for how they are written. Mahir256 (talk) 01:34, 1 February 2023 (UTC)
@Mahir256: Platts is an outdated dictionary; there are lots of similar lexemes supposedly attested in Platts, but not in any other modern dictionaries or books in practice. Native Urdu speakers wouldn't even understand these terms. I decided to reach out to the other user previously to discuss some issues, but unfortunately they decided to be very rude with me. If you can read Punjabi (in Shahmukhi), then you might understand the exchange. نعم البدل (talk) 01:40, 1 February 2023 (UTC)
I will agree that using it as the only source of meanings is unsuitable (I try to ensure that a word is also defined in some other non-English source before using it), but it's not as if that dictionary should entirely be discounted as a consequence; other information about the words within it is useful. With respect to 'native Urdu speakers', what is spoken by the successor states of the British Raj did not simply snap into existence on that midnight in August 1947; people were speaking it sixty-three years prior, and unless you can demonstrate that Platts was bribed/compelled/etc. by Sanskritists to unnaturally include certain types of words in that dictionary, records like that can help fill some gaps of a non-semantic nature. Mahir256 (talk) 01:49, 1 February 2023 (UTC)
You make a fair point. Platts shouldn't be simply discarded – and I'm not trying to say that it's worthless or anything, I use it on a regular basis too, but it doesn't make sense to label these types of lemmas 'Urdu' if they can't be attested. Google Books includes Urdu books from as early as the 1800s, yet you won't find these lemmas in any of them. Wiktionary has a good policy when it comes to attestation, which requires 3 citations, and I'm sure it would be quite difficult to satisfy this. I'm not even asking for 3; if you can find these lemmas in an independent and reputable source, aside from Platts, then I'm happy to keep them. نعم البدل (talk) 02:01, 1 February 2023 (UTC)
That it is difficult to find attestations is certainly true, but it is not impossible, and so long as it is not impossible it should not be discouraged through the removal of lemmata in one script and not another. There are multiple languages with diverse vocabulary origins (and whose speakers may tend to prefer one origin or another for their own speaking and writing habits) whose lexemes are nevertheless presented in multiple scripts for uniformity, even if certain words are more heavily found in one script over another; languages split by the Radcliffe Line do not have to be exceptions to this trend. Have a look at the usage examples I found for a clearly Sanskritic word; if you want to attack the sources I used for these quotations, do this elsewhere, such as at w:WP:RSN. Mahir256 (talk) 02:31, 1 February 2023 (UTC)
@Mahir256: I'm not sure why you felt the need to mention "clearly Sanskritic". I did make it clear that I'm not basing this entirely on etymology, because I would have the same issue if it was a direct, unattested Arabic borrowing or similar. Again, I'm not attacking anything – all I'm saying is that it should be attested. The references you gave for आदिवासी/آدیواسی (L678809) are perfectly fine. However, using sources like Q111311147 to attest the Urdu forms, as was done at शोध/شودھ (L1010839), when the source itself states that the part being cited is in fact a transliteration ("اس کے ہر اندراجی لفظ پر صحت و احتیاط کے ساتھ اعراب لگائے گئے ہیں تاکہ ہندی الفاظ کا صحیح تلفظ معلوم ہو سکے۔") of the Hindi lexeme, isn't fine. نعم البدل (talk) 02:44, 1 February 2023 (UTC)
@Mahir256: Also, for कुष्ठ/کُشٹھ (L684522) - you said in your edit summary that it was attested in Platts with the "ٿ" letter and mentioned that ٹ is the "modern version" even though ٹ was definitely used in the same era. 'کشٿهہ' - the given word in Platts is quite different to 'کشٹھ'.
@Mahir256: And what logic is this even [1]? That's just laughable – शोधकर्ता/شودھکرتا (L680210). If I coin the term "degazingless" on the basis that each part of the word is attested, could I add it to Wikidata? نعم البدل (talk) 03:05, 1 February 2023 (UTC)
(double edit conflict) If in fact you're not basing this on etymology, then find me a Hindustani lexeme which is in fact a "direct, unattested Arabic borrowing or similar" that has enough of a problem that removing the lemma on it is such a necessity: thus far the lexemes which you have made efforts to remove information on are all indeed Sanskritic. I'm glad to hear that the sources for the lexeme I added sources to are fine; you do make a point re: some of those inter-variety dictionaries, and while I can accept them not being cited for lemmata, I still stand by my point re: discouraging searching for sources (how can you know what it is to search for if you don't know what it is?). I won't try to defend the mere existence of शोधकर्ता (a word which in DSAL can only be found in McGregor's dictionary and not the Shabdsagar; such a defense could be @Vis M:'s job), but as long as the lexeme exists I will defend its having an Arabic script lemma that can be readily deduced from the written representations of its parts. Speaking of which, as for the 'begazingless' analogy, my edit summary was intended to characterize not the word itself but only the representations of its parts in the Arabic script (if an individual from Tabriz needed to spell out a compound Azerbaijani word for their compatriots, but this word was only used in Baku, even as the two parts of that word were in fact in common use in both Latin and Arabic scripts, the use of the spellings of those two parts would in my view be an entirely acceptable expression of effort), although if someone did in fact seriously use 'begazingless' as a word and this use could be provided as a source, I would not mind having a lexeme for it. Mahir256 (talk) 03:17, 1 February 2023 (UTC)
@Mahir256: Apologies for the double conflict:
  • then find me a Hindustani lexeme which is in fact a "direct, unattested Arabic borrowing or similar" that has enough of a problem that removing the lemma on it is such a necessity – errr, what sorry? I found these lexemes by scrutinising another user's edits, not by going through every single "Hindustani" lexeme on Wikidata. As I explained earlier, I had concerns already with another user. I wasn't even trying to understand the etymology. If there does happen to be an Urdu lexeme on Wikidata, of Arabic origin that I object to, I will let you know, but so far it is a coincidence that these terms are Sanskritic, but realistically speaking – why should that even be surprising? Hindi has absorbed a lot more Sanskrit learned-borrowings than Urdu.
I mean, I'll give you a recent, similar example from Wiktionary, if you really wish.
  • how can you know what it is to search for if you don't know what it is – Example sentences? The definitions or context surrounding the term? Sometimes even searching the word alone on Google Books will yield good results, thanks to Unicode. And yeah, I wouldn't object to 'degazingless' (which I mistyped as 'begazingless', works nonetheless lol) either as long as it was attested. Covfefe became widespread, even if it was only for a short period of time.
  • but this word was only used in Baku, even as the two parts of that word were in fact in common use in both Latin and Arabic scripts – I know this is meant to be an example hinting that Hindi and Urdu are the same language, but the problem with this example is that Azerbaijani doesn't have various standard forms with different scripts. I'll say it again: if you asked a native Urdu speaker what such terms meant (ones that can only be attested in Platts, for instance), especially if they've had limited exposure to Hindi, regardless of whether they were born in Hyderabad, Deccan or Karachi, Pakistan, they wouldn't be able to tell you the exact meaning. نعم البدل (talk) 03:53, 1 February 2023 (UTC)
Nevertheless, I do appreciate you providing the citations for some of the other terms. نعم البدل (talk) 03:53, 1 February 2023 (UTC)

erroneous P1476

Please remove erroneous title (P1476) from Pakistan places. Kethyga (talk) 05:36, 15 March 2023 (UTC)

Hi @Kethyga: My apologies. This was done automatically with the Wikidata extension. I'll remove them. نعم البدل (talk) 05:47, 15 March 2023 (UTC)