Property talk:P11577

Documentation

said to be the same as lexeme
some source considers this lexeme to be the same lexeme or a spelling variant of another lexeme
Allowed entity types are Wikibase lexeme (Q51885771): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11577#Entity types
Symmetric property: if [item A] has this property linked to [item B], then [item B] should also have this property linked to [item A]. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11577#Symmetric, SPARQL
Scope is as main value (Q54828448): the property must be used only in the specified way (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P11577#Scope, SPARQL

What does this property mean?

@Loominade, ArthurPSmith, AdamSeattle, BlaueBlüte, Midleading: Since this property has been created with no response to my comment on Wikidata:Property proposal/said to be the same as (lexeme), could someone please answer me here instead? What use is this property when nobody can know what a statement that uses it is supposed to mean? (In my comment I listed four different things it could mean, and I'm sure there are more.) How do we decide that a source actually considers two lexemes to be the same and has not, for example, only included them under the same headword to save space? - Nikki (talk) 02:11, 13 February 2023 (UTC)

I think every time a dictionary lists two lexemes under the same headword, the source considers them the same. Wikidata only links to online dictionaries, which have no space constraints. Asserting that the source actually considers them different looks like original research, unless it is clearly indicated in the source. A "spelling variant" in Wikidata should normally be stored as a different form, so if a source says they are spelling variants, then they are said to be the same lexeme. Midleading (talk) 02:23, 13 February 2023 (UTC)
We already use identifier shared with lexeme (P9531) to indicate that things are listed under the same entry, so what information does this property add? By "space" I don't mean storage space, I mean the presentation of the data (and plenty of online dictionaries are digitised versions of older print dictionaries which inherit the structure of the original dictionary anyway). - Nikki (talk) 02:54, 13 February 2023 (UTC)
identifier shared with lexeme (P9531) is indeed very similar to this, but because it is a qualifier, we cannot use it for sources that have no external identifier property. With this property, however, we can cite any source using reference URL (P854), stated in (P248) and so on. Midleading (talk) 03:06, 13 February 2023 (UTC)
Actually I think this property is more useful than identifier shared with lexeme (P9531). For human users, this property is shown as a prominent statement rather than hidden under various identifiers as a small, hard-to-find qualifier. Automated users no longer have to enumerate all identifiers. Most automated users would just use a "wdt" property, regardless of which and how many sources support that statement. Midleading (talk) 06:18, 13 February 2023 (UTC)
identifier shared with lexeme (P9531) does not always mean the source considers the words to be identical. In some cases it's the other way around. For instance, jmdictdb lists two homographs under the same ID while acknowledging that they have different etymologies: タクト (L997115) / タクト (L997115). Other sources would give each homonym its own hash ID, like Hochzeit#1 and Hochzeit#2. If DWDS did not offer hash links, we would list them as sharing the same ID. -- Loominade (talk) 07:49, 13 February 2023 (UTC)

Asserting that the source actually considers them different looks like original research, unless it is clearly indicated in the source.

@Midleading: It is usually clearly indicated, isn't it? -- Loominade (talk) 09:02, 13 February 2023 (UTC)
Thanks for your examples. In these cases homograph lexeme (P5402) should be used, of course. If P11577 is used, with or without homograph lexeme (P5402), then users will be told that Wikidata thinks the two lexemes are the same. Users can either just get a quick answer using "wdt:P11577" without worrying about it, or they can fix it by consulting the references; it's up to the user to decide whether they want to clean up Wikidata, the same as with said to be the same as (P460). The problem with said to be the same as (P460) clearly exists, but not everybody cares about that. If we don't have P11577, the user is forced to use "?lexeme ?prop ?statement. ?statement pq:P9531 ?samelexeme." This is much slower. If the user wants to fix it, the user is forced to filter on "?prop", which requires the user to review every possible external identifier for dictionaries. Midleading (talk) 09:06, 13 February 2023 (UTC)
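The two query shapes compared in this comment can be sketched as follows. This is only an illustration: the prefixes are the standard Wikidata ones (written out so each query is self-contained), while the variable names and the LIMIT are arbitrary choices, and nothing here measures how much slower the second form actually is.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Direct lookup through the new property.
SELECT ?lexeme ?sameLexeme WHERE {
  ?lexeme wdt:P11577 ?sameLexeme.
}
LIMIT 100
```

```sparql
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

# Qualifier-based lookup: ?prop is left unbound, so every statement on
# every lexeme is a candidate until the P9531 qualifier narrows it down.
SELECT ?lexeme ?prop ?sameLexeme WHERE {
  ?lexeme ?prop ?statement.
  ?statement pq:P9531 ?sameLexeme.
}
LIMIT 100
```

The second query also returns ?prop, which is the variable a user would have to filter on to check one dictionary at a time.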
This property can also tell us about the completeness of our data. If a dictionary lists 5 senses for a particular lexeme but we only have 4, our data appears to be incomplete. If there is a lexeme connected via said to be the same as lexeme (P11577), we can assume that the missing sense might be recorded there. So we probably shouldn't count those cases when assessing completeness. -- Loominade (talk) 08:27, 15 February 2023 (UTC)

Mismatches between DWDS and DUDEN

# Lexemes whose Duden ID has no '_' but whose DWDS lemma ID has a '#' (as in Hochzeit#1)
SELECT * WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P8376 ?Duden_ID.
  ?item wdt:P9940 ?DWDS_lemma_ID.
  FILTER(!CONTAINS(?Duden_ID, '_'))
  FILTER(CONTAINS(?DWDS_lemma_ID, '#'))
}

…for lexemes that DWDS treats as separate lexemes and

# Lexemes whose Duden ID has a '_' but whose DWDS lemma ID has no '#'
SELECT * WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?item wdt:P8376 ?Duden_ID.
  ?item wdt:P9940 ?DWDS_lemma_ID.
  FILTER(CONTAINS(?Duden_ID, '_'))
  FILTER(!CONTAINS(?DWDS_lemma_ID, '#'))
}

…for lexemes that Duden Online treats as separate lexemes -- Loominade (talk) 15:18, 13 February 2023 (UTC)

These give some false positives. Feel free to fix them if you can. An underscore in the Duden ID should not count when it is at the beginning or end of the ID. --Loominade (talk) 15:26, 13 February 2023 (UTC)
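One possible way to ignore leading and trailing underscores is to replace CONTAINS with a regular expression that requires a non-underscore character on both sides of the underscore. The sketch below applies this to the second query; the pattern is an assumption about what should count as a separator, it has not been checked against the actual Duden IDs, and it only addresses the one cause of false positives named above.

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?item ?Duden_ID ?DWDS_lemma_ID WHERE {
  ?item wdt:P8376 ?Duden_ID.
  ?item wdt:P9940 ?DWDS_lemma_ID.
  # Match only underscores that sit between two non-underscore characters,
  # so an underscore at the very start or end of the ID is ignored.
  FILTER(REGEX(?Duden_ID, "[^_]_+[^_]"))
  FILTER(!CONTAINS(?DWDS_lemma_ID, "#"))
}
```

For the first query, the same pattern would be negated, i.e. FILTER(!REGEX(?Duden_ID, "[^_]_+[^_]")) together with FILTER(CONTAINS(?DWDS_lemma_ID, "#")).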
