Property talk:P11757

Latest comment: 9 months ago by Mahir256 in topic Some notes regarding this property

Documentation

Arabic Ontology form ID
identifier for a word form in the Birzeit University Arabic Ontology database of morphologic entities
Allowed entity types are Wikibase form (Q54285143): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11757#Entity types
Format “^[0-9]+$”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11757#Format, SPARQL
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified way (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11757#Scope, SPARQL
Lexeme language: Arabic (Q13955): this property should only be applied to lexemes with these languages (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11757#language
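
The entity-type and format constraints above can be checked locally before adding a statement. The following is a minimal sketch, not the constraint checker Wikidata itself runs: the function name `check_statement` and the example value `12345` are hypothetical, and the form-ID shape (`L…-F…`, as in L8228-F4) is assumed from how form IDs appear on this page. It uses a full match on `[0-9]+`, which is slightly stricter than the PCRE pattern because it also rejects a trailing newline.

```python
import re

# Digits-only pattern, taken from the format constraint "^[0-9]+$".
VALUE_PATTERN = re.compile(r"[0-9]+")
# Assumption: a Wikidata form ID looks like "L8228-F4".
FORM_ID_PATTERN = re.compile(r"L[0-9]+-F[0-9]+")


def check_statement(subject_id: str, value: str) -> list[str]:
    """Return a list of constraint problems (empty if none were found)."""
    problems = []
    # Entity-type constraint: the property belongs on forms, not lexemes or items.
    if not FORM_ID_PATTERN.fullmatch(subject_id):
        problems.append("entity type: subject is not a form")
    # Format constraint: the identifier must consist of ASCII digits only.
    if not VALUE_PATTERN.fullmatch(value):
        problems.append("format: value is not all digits")
    return problems


# Hypothetical identifier value, used only for illustration.
print(check_statement("L8228-F4", "12345"))
print(check_statement("L2465", "12a45"))
```

The scope and lexeme-language constraints are not covered here, since checking them requires the statement's context and the lexeme's language.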

Some notes regarding this property

There are over 15 million word forms in the Arabic Ontology, but many of them include prepositions, conjunctions, or pronominal indicators (whether marking possession or an object) attached to the word in question. لَبَن (L2465), for example, has 783 word forms in the Ontology attached to that lexeme's Arabic Ontology lemma ID (P11038), but that lexeme only has 33 forms on Wikidata. The remaining 750 word forms including those attachments should not be added under any circumstances to that lexeme. Mahir256 (talk) 13:05, 29 April 2023 (UTC)

Thanks a lot for this explanation, @Mahir256! Could you give a specific example of forms in this database that should not be included in Wikidata Lexemes? To explain that in structured data, would Wikidata property for an identifier that does not imply notability (Q62589320) be an acceptable value for instance of (P31) for this property? Is there another, better way? Cheers, VIGNERON (talk) 10:41, 1 May 2023 (UTC)
@VIGNERON: It's been a while, but here are some examples of those forms which should be excluded:
The extra affixes applied to the excluded forms should themselves be forms on other lexemes (such as وَ (L35596) and بِ (L35613)). Right now the forms on the noun and adjective above correspond to the wordforms shown in the English Wiktionary pages for wikt:لبن and wikt:ماهر, though I would have no objection to removing those marked 'definite' if ال (L2288) should be considered similarly to the affixes. (In any case, tacking on the does-not-imply-notability item seems fine for now.) Mahir256 (talk) 21:04, 28 June 2023 (UTC)
@Mahir256 @VIGNERON It is unclear to me why these forms should be excluded. (Or what makes pronominal wordforms invalid in some way that those with case and number affixes are not?) The reasons why they are included in the source database are as applicable on Wikidata as they are there. For scriptural concordance, phonetic annotations, prosodic analysis, and for etymological reasons at least some of these forms are necessary to include. I would not hesitate to add a form like وَالصَّابِئُونَ for example.
The Arabic Ontology database is also not indiscriminate in what it includes; for example, obscure dual forms will only be noted on the lemma entry without individual form entries for the case endings like مَايَانِ (L8228-F4). عُثمان (talk) 21:48, 28 June 2023 (UTC)
Same comment: my Arabic is not good, and it would be really useful to have these forms, as they are not trivial (and if I'm not mistaken, most of them can't be constructed automatically, or not easily). In any case, it's important, and it would make contributing easier, if we had some clear rules about inclusion and the reasons behind them. Cheers, VIGNERON (talk) 10:09, 1 July 2023 (UTC)
@عُثمان: Yes, you do make a good point regarding scriptural concordance and etymologies of words in other languages, and for these I would support composed forms' inclusion. (I am inclined to recognize the form selectivity of the Arabic Ontology as well, though it would be good to see how this selectivity affects/is guided by other parts of the website.) Mahir256 (talk) 15:09, 8 July 2023 (UTC)