Wikidata:Property proposal/description
Description
Originally proposed at Wikidata:Property proposal/Generic
Description | (aliases: Definition, Abstract, Summary, Curatorial comment, Biographic note): a well-edited short description of the item, with mandatory reference. Keep it short (a couple of paragraphs): Wikidata is not a full-text repository, and respect others' copyright. |
---|---|
Represents | definition (Q101072), description (Q1200750), abstract (Q333291), summary (Q776754) |
Data type | Monolingual text |
Domain | any entity |
Allowed values | Reuse some of the great regexps from title (P1476) that warn on LaTeX chars, HTML tags, etc. |
Example 1 | rocking chair (Q14963) →
|
Example 2 | The Virgin Cataphyge (Refuge) and St. John the Evangelist (Q84545297) →
language: Bulgarian, reference: reference URL (P854) http://bidl.cc.bas.bg/viewobject.php?id=1&lang=bg; language: English, reference: reference URL (P854) http://bidl.cc.bas.bg/viewobject.php?id=1&lang=en |
Example 3 | Mother of God Pantonhara (Q84296272) →
language: English, reference described by source (P1343) The icon of the Mother of God Pantonhara in the Icon Gallery (Q84291564); Academia.edu publication ID (P7896) 9843052 |
Source |
|
See also | title (P1476), name (P2561), official name (P1448), native label (P1705), subtitle (P1680), media legend (P2096), currency symbol description (P489) |
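The "Allowed values" row could be sketched roughly as follows. This is only an illustration under stated assumptions: the two patterns and the function name `description_warnings` are hypothetical, and they are not the actual constraint regexes on title (P1476), which this proposal does not quote.

```python
import re
from typing import List

# Hypothetical patterns for illustration only; NOT the real P1476 constraints.
HTML_TAG = re.compile(r"</?[A-Za-z][^>]*>")   # e.g. <b>, </i>, <br/>
LATEX_COMMAND = re.compile(r"\\[A-Za-z]+")    # e.g. \textit, \emph


def description_warnings(text: str) -> List[str]:
    """Return warning codes for markup that should not appear in a plain-text description."""
    warnings = []
    if HTML_TAG.search(text):
        warnings.append("html-tag")
    if LATEX_COMMAND.search(text):
        warnings.append("latex-command")
    return warnings


if __name__ == "__main__":
    print(description_warnings("A chair mounted on <b>rockers</b>."))
    print(description_warnings("A chair mounted on curved rockers."))
```

A check like this only warns; it says nothing about length, newlines, or copyright, which are discussed separately below.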
Motivation
Unlike WD props, the labels and descriptions on top don't have provenance. Furthermore, descriptions are intended to be short and used only for disambiguation, and don't have any quality control. To compensate, we have title (P1476), name (P2561) (and a whole slew of sub-props or variants like official name (P1448), native label (P1705), subtitle (P1680), media legend (P2096), currency symbol description (P489)) for labels. But there is no similar prop for Descriptions.
- I propose one prop for several different things for now (Definition, Abstract, Biographic note) and we can specialize such props later: what do you think?
- In many cases it's important to respect newlines, how could that be supported?
- Maybe we could even use this for Song lyrics or Poem text? But I think that's going too far and we should have a separate prop for this. Should I propose one? (Moebeus expressed copyright concerns, so I'm dropping this idea, even though there are MANY lyrics sites)
Vladimir Alexiev (talk) 09:36, 7 February 2020 (UTC)
Discussion
Support Makes sense to me. WD description fields are currently being used in many ways that are not compatible with "front-facing" display on Wikipedia etc. A dedicated property would help with that, querying would be easier, constraints could be written, etc. Moebeus (talk) 10:04, 7 February 2020 (UTC)
- Support --Crowjane7 (talk) 15:55, 7 February 2020 (UTC)
- Support --Brimwats (talk) 08:33, 8 February 2020 (UTC) -- as libraries, archives, galleries, and museums draw upon and use Wikibase more (such as Project Passage) the description text becomes even more important and essential for use.
- If another organization uses Wikibase and needs a certain property they can create it on their end. They don't need Wikidata to create properties. ChristianKl ❪✉❫ 18:54, 10 February 2020 (UTC)
- Support --illipmich (talk) 17:18, 7 February 2020 (UTC)
- Oppose primarily for the overt generality of this property. At least with something like scope and content (P7535) there are some constraints on its use and how it is structured, where here no such limitations on this proposed property's use exist. With respect to the front-facing aspect, Wikidata descriptions should in fact be usable as something that can be exposed to an end-user; the stuff useful only to a Wikidata editor should be placed in a separate Wikidata property (à la Wikidata usage instructions (P2559)) but the proper display of this property within the interface is presently blocked for technical reasons. Mahir256 (talk) 18:23, 7 February 2020 (UTC)
- Oppose I don't believe unstructured text belongs in Wikidata. We have had this conversation before and I can't remember precisely in what context - might have been "credit line". The sample text reads very much like the text type & length that used to be acceptable as a Wikipedia stub. That Wikipedia now has minimal length requirements is not a reason to put stubby text snippets in Wikidata. Also, such text descriptions of objects are generally copyrighted, if they are not from some 100-year-old catalog. Jane023 (talk) 19:22, 7 February 2020 (UTC)
- Oppose: unrelated to structured data. Nomen ad hoc (talk) 21:14, 7 February 2020 (UTC).
- Oppose inherently unstructured data. If it’s a block quote then it’s copyright problematic (certainly not cc0-pure like wikidata likes to keep things). Longform freetext properties seem to be the opposite of what wikidata’s about, I would have thought. Wittylama (talk) 22:48, 7 February 2020 (UTC)
- As the proposed description says "respect others' copyright". It's no more copyright problematic than any other info. Eg Nomenclature (one of the examples) is released under appropriate copyright in its entirety, labels and descriptions alike. I challenge everyone to give examples of datasets where "data" is cc0 but descriptions are not.
- @Wittylama: I am surprised that you as a GLAM person don't see the value of such a prop for GLAMs.
- WD has many items that will probably never get a WP page. Such items will be the poorer if one can't record a rich description. Eg CHIN want to add rich descriptions to the 15k Nomenclature objects...
- Many GLAM projects want to use WD as an integration platform. Eg we plan to import 500 Orthodox icons and enrich data about painters, saints, monasteries etc, see http://rawgit2.com/VladimirAlexiev/my/master/pres/20200130-Wikidata-Icons/Slides.html. But without rich description it makes little sense to describe icons. --Vladimir Alexiev (talk) 19:42, 8 February 2020 (UTC)
- I can and do see the value of rich, contextual, nuanced, descriptive information (especially for glams). However, I am saying that paragraph-length freetext is not structured data, and therefore isn’t within the scope of Wikidata. Moreover, lengthy quotes (such as in the examples given above) are more than trivial because they represent the entire text of a professionally written description and therefore are copyrighted information. This is different from information like ‘height’ or ‘year’ or ‘creator’, which are simple facts and therefore non-copyrightable. Thus, this property could only be used when a glam has proactively decided to share ALL fields of their collection records under CC0 - this is not information that can legally be scraped from databases and republished by us unless they’ve released it.
- @Wittylama: No, you cannot scrape somebody's database just because you believe "simple facts are non-copyrightable". Collections of such facts are very much copyrightable and people pay big bucks to obtain appropriate datasets. Descriptions are no more copyright-problematic than other fields --Vladimir Alexiev (talk) 11:17, 11 February 2020 (UTC)
- We have had this discussion before with glam concepts like “credit line”. It is very useful and important information about objects in glam collections, BUT “credit lines” are unstructured data/freetext and potentially copyrightable, and therefore we do not [yet] have a solution for how to handle them in wikidata.
- Instead of republishing the full text of the description, could you instead use a property like this to link/reference to where the full description can be found (URL or published catalogue), when such a description exists? Wittylama (talk) 09:12, 9 February 2020 (UTC)
- @Wittylama: Sure we can and will, but that's no substitute for having a decent description on the item. --Vladimir Alexiev (talk) 11:17, 11 February 2020 (UTC)
- Comment I suppose we needed something like this to avoid showing bias against anything that is not archive related (given that we do have scope and content (P7535)). --- Jura 09:56, 8 February 2020 (UTC)
- Comment I think datatype should be monolingual string. Multilingual isn't currently in the works. --- Jura 09:57, 8 February 2020 (UTC)
- Support --eroux108 (talk) 16:53, 8 February 2020 (UTC)
- Oppose. The point of Wikidata is to collect structured data, which this isn't. --Yair rand (talk) 23:55, 9 February 2020 (UTC)
- Oppose The motivation states for disambiguation purposes. But we already have "description" and its Help page explains why we need to use the minimum amount of text to disambiguate. Thadguidry (talk) 17:02, 10 February 2020 (UTC)
- hi @Thadguidry: I think you've misread the motivation: "descriptions (at the beginning) are intended to be short and used only for disambiguation, and don't have any quality control". This one is for an authorized/editorial description --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
- Oppose This will cause more trouble than it's worth. Moderating would get a lot more complicated because of all the possible copyright violations, and it will be very unclear how this property relates to Wikipedia articles. There are lots of potentials for useless duplication of content, and newcomers will be confused by having both a description field and a description property. Note that just for 'paragraph-length' summaries of Wikipedia articles there is already an excellent API method. Husky (talk) 18:16, 10 February 2020 (UTC)
- @Husky: Why don't people get confused by the props "name" & "title" vs the "label" on top? Wikipedia abstracts are ok, but about 2/3 of WD items have no WP article --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
- Oppose To me even the examples look copyright violating. The Bulgarian website doesn't have any statement that suggests that their content is in the public domain. ChristianKl ❪✉❫ 18:54, 10 February 2020 (UTC)
- @ChristianKl: I assure you that I have spoken to the creators of the site (Institute of Mathematics and Informatics of the Bulgarian Academy of Sciences), and we'll be importing these icons to Wikidata. Do you only doubt the descriptions, or all data on the site? That's my point: a longer text is no more inherently copyright-problematic than other data --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
- Statements about who created an icon or when it was created are factual statements that are not subject to copyright. Nothing on the example item The Virgin Cataphyge (Refuge) and St. John the Evangelist (Q84545297) seems to me like it's protected by copyright. On the other hand there's creative work in a description that is protected by copyright.
- There's the issue of EU database right given that Bulgaria is an EU country but that's a separate issue from the copyright. ChristianKl ❪✉❫ 13:55, 11 February 2020 (UTC)
- Oppose Unstructured data, no need to duplicate Wikipedia(s).--Jklamo (talk) 19:13, 10 February 2020 (UTC)
- @Jklamo: about 2/3 of WD items have no WP article, that includes a lot of GLAM objects, this prop is intended for them. (Similarly, about half of the 15k object descriptions in Nomenclature have no WP article or WD item: we're creating them https://tools.wmflabs.org/mix-n-match/#/catalog/3270 and many of them are not likely to ever get a WP article.) Eg how many of the 100 or so types of chair https://www.nomenclature.info/parcourir-browse.app?lang=en&id=1090&wo=I&ws=INT do you think will get a WP article? --Vladimir Alexiev (talk) 11:12, 11 February 2020 (UTC)
I see this one will get shot down, so let me explain what will happen:
- We will import icon descriptions in the existing poor "description" fields (without provenance), because 99% of these will be new WD items.
- However, we won't be able to import Nomenclature descriptions in the poor "description" fields because many of these are existing WD items, and there's only 1 slot per language, and we can't be sure we won't overwrite information in that slot, no matter how crappy the existing description may be. In fact even for newly created Nomenclature items, we already make poor descriptions eg exterior shutter (Q80794411) has the ancestor path "(Category 01: Built Environment Objects> Building Components> Door & Window Elements> Window Element)" in lieu of a proper description (enough for disambiguating, but not satisfactory). When Nomenclature are ready with their proper editorial description (a big effort on their part), we won't be able to refresh the crappy descriptions in WD items --Vladimir Alexiev (talk) 11:24, 11 February 2020 (UTC)
- "(Category 01: Built Environment Objects> Building Components> Door & Window Elements> Window Element)" isn't a valid description. Why create it in the first place? --Yair rand (talk) 22:46, 11 February 2020 (UTC)
- @Yair rand: It serves the purpose to disambiguate and is better than nothing, so why do you think it's invalid? Rather than critique, why didn't YOU make a better description of exterior shutter (Q80794411) if you dislike this one? --Vladimir Alexiev (talk) 15:51, 12 February 2020 (UTC)
- It's not a valid description under WD:D. I think it would be far preferable to leave it blank, and have it be easily discoverable as a descriptionless item, than to have a non-description taking up the field. I'd recommend deleting all such descriptions. --Yair rand (talk) 06:05, 17 February 2020 (UTC)
I just found out that the poor "description" field has a limit of 250 chars. This means we can't import useful descriptions from GLAM datasets like Nomenclature (@Crowjane7:) or icons. Cheers! --Vladimir Alexiev (talk) 13:39, 13 February 2020 (UTC)
- Comment I think a more narrowly focused proposal with limited domain or source, similar to the existing property scope and content (P7535), would be more likely to succeed. ArthurPSmith (talk) 18:01, 13 February 2020 (UTC)
- Leaning oppose while I understand the rationale behind this: 1) This isn't really structured data, 2) I am concerned about licencing issues, and 3) I think that as proposed it has a bit too broad a scope and is too easy to misuse, leading to more trouble. --Kostas20142 (talk) 22:42, 16 February 2020 (UTC)
- Not done, no consensus.--GZWDer (talk) 20:31, 18 March 2020 (UTC)
- It is quite usual for individuals of a class (to draw a web-ontology parallel) to have a "description" field: even Wikidata entities have a "heading description" that is a short text. It is quite stunning that we cannot find in Wikidata a property labelled "description" that would simply be used to specify that an item can have a "textual description"... painting (Q3305213) has depicts (P180), which basically corresponds to a description, but what about a service (Q7406919) or a concept (Q151885)?... Maybe a different property can be used: what would be the corresponding property for the heading description of Wikidata entities?
- The advantage of a "description" property would be to be self explanatory and general enough to be widely used among wikidata entities.
- BTW "description" is a very basic property of Thing (the most generic item) in schema.org, cf: https://schema.org/Thing & https://schema.org/description
- NB: a "description" property could logically be a super-property of depicts (P180) which is currently a subproperty of has part(s) (P527) D3fk dev (talk) 14:43, 27 July 2022 (UTC)
- @D3fk dev: Every item has a multilingual description field at the top level, just like labels and aliases. Why is something more needed? Note also that using a property for this has been discussed here before. ArthurPSmith (talk) 16:25, 27 July 2022 (UTC)
- The point of Wikidata is having structured descriptions of concepts, not textual ones. Silver hr (talk) 16:45, 27 July 2022 (UTC)
- @Silver hr: Actually the point here is not to add a description at the top of every Wikidata item page (as you said, the multilingual description is sufficient for that), but simply to expose as structured information the fact that some individuals of certain classes (instances of these classes) need a description to be complete entities:
- As an example, I argued previously that service (Q7406919) should be completed by adding a "description" property (or similar): an instance of service (Q7406919) without a description of the service offered is reduced to a mere service title rather than a comprehensive service entity, and the user of such a service might not understand what using it will do. The same holds for many other entities that are instances of subclasses of concept (Q151885).
- You can draw a parallel with painting (Q3305213), which has depicts (P180) (free text): it basically corresponds to a "description" property of paintings and is required to make an instance of a painting entity comprehensive... but we cannot use depicts (P180) for a service (Q7406919) or any other concept (Q151885)
- On another point, the multilingual description of each item is currently a simple field, but in a more structured design it should logically have been a property of each Wikidata entity (Q32753077) or Wikidata item (Q16222597) (as a "description" property?)... but that might imply that each Wikidata item page would be considered an instance of Wikidata entity (Q32753077) or Wikidata item (Q16222597), which might not completely be the case. D3fk dev (talk) 10:10, 28 July 2022 (UTC)