Wikidata:Property proposal/HTML entity
HTML entity
Originally proposed at Wikidata:Property proposal/Term
Description | string that represents this character in an HTML page |
---|---|
Represents | SGML entity (Q285300) |
Data type | String |
Domain | character (Q3241972) |
Allowed values | &[A-Za-z0-9]+; |
Example | |
Source | https://dev.w3.org/html5/html-author/charref |
Planned use | The data should be completed. |
See also | Unicode character (P487) |
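
The "Allowed values" constraint above can be sketched as a small check. This is only an illustration, assuming the pattern must match the whole value; the helper name `is_allowed_value` is hypothetical and not part of the proposal.

```python
import re

# Pattern copied from the "Allowed values" row of the proposal.
ALLOWED = re.compile(r"&[A-Za-z0-9]+;")


def is_allowed_value(value: str) -> bool:
    """Return True if the whole string matches the proposed pattern.

    Assumption: the constraint applies to the entire value, so
    fullmatch() is used rather than search().
    """
    return ALLOWED.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("&amp;", "&frac12;", "amp", "&#38;"):
        print(candidate, is_allowed_value(candidate))
```

Under this reading, numeric references such as `&#38;` are rejected because `#` is not in the character class, and a name without the leading `&` and trailing `;` is rejected as well.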
- Motivation
- It would be nice to have character ↔ HTML entity mappings in Wikidata.
- Discussion
- Open questions:
- Should we include & and ; at the beginning and the end, respectively?
- Should this be a qualifier of Unicode character (P487)?
- Matěj Suchánek (talk) 09:00, 13 November 2017 (UTC)
- Support I would include & and ; as I'm more used to seeing HTML entities written with these characters. --Pasleim (talk) 20:28, 13 November 2017 (UTC)
- Support we should link to an authoritative source website for this. ArthurPSmith (talk) 21:12, 13 November 2017 (UTC)
- @ArthurPSmith: Maybe this? Giovanni Alfredo Garciliano Díaz ★ diskutujo 23:28, 13 November 2017 (UTC)
- This sounds good, so let's include & and ;. ChristianKl (✉) 12:27, 14 November 2017 (UTC)
- Support Giovanni Alfredo Garciliano Díaz ★ diskutujo 23:28, 13 November 2017 (UTC)
- Support David (talk) 08:17, 14 November 2017 (UTC)
- Support; though I think this should be an external identifier; both as the string does "identify" the entity, and so that we can use formatter URLs. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:46, 14 November 2017 (UTC)
- Support as external identifier, though in that case we should find a formatter URL for a site describing these entities. Mahir256 (talk) 17:18, 14 November 2017 (UTC)
- It's not a unique identifier since multiple strings can represent the same thing (for example, &verbar;, &vert; and &VerticalLine; all represent a vertical bar). String datatype is correct here. ArthurPSmith (talk) 18:46, 14 November 2017 (UTC)
- External identifier was also my idea... Multiple characters cannot be mapped to a single HTML entity. Wikidata's representation of "symbols" is quite immature, though. With Ä (Q9987), both the lowercase and the uppercase symbol can be represented, each having a different Unicode character (P487) and also a different HTML entity. I think we will need to consider creating entities for each letter variant, separate from the common understanding of a "letter", which might later turn out to be useful with Wiktionary integration. Matěj Suchánek (talk) 08:47, 15 November 2017 (UTC)
- External IDs must uniquely identify a subject; but need not be unique in doing so; we have several properties for which there can be more than one ID for a subject, ranging from VIAF to listed buildings in England. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:33, 17 November 2017 (UTC)
- External IDs are really not very useful if there are more than rare exceptions to the uniqueness relationship. It makes linking and lookups much harder. I think HTML entities have too many exceptions to qualify in this case. Plus we have no formatter URL, so there is no advantage there. ArthurPSmith (talk) 19:57, 20 November 2017 (UTC)
- Support Just checked: fortunately, while MediaWiki automatically converts an entity into the respective character, Wikibase doesn't. ChristianKl (✉) 20:07, 15 November 2017 (UTC)
- Support - Surprised this doesn't exist already. -- Fuzheado (talk) 21:02, 16 November 2017 (UTC)
- It's not yet ready given that we have description. ChristianKl (✉) 15:41, 23 November 2017 (UTC)
- @ArthurPSmith, Fuzheado, Giovanni Alfredo Garciliano Diaz, Pasleim, Mahir256: @Pigsonthewing, Matěj Suchánek, ديفيد عادل وهبة خليل 2: Done Created as HTML entity (P4575). ChristianKl (✉) 14:44, 24 November 2017 (UTC)
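
The uniqueness question raised in the discussion (several entity strings for one character, but only one character per entity) can be checked against the HTML5 named character reference table that ships with Python's standard library as `html.entities.html5`. This is a rough sketch rather than anything the proposal itself specifies; the helper name `entities_for` is hypothetical.

```python
import html
from html.entities import html5


def entities_for(char: str) -> list[str]:
    """Return every named reference, written with & and ;, that expands to char.

    The html5 table also lists a few legacy names without the trailing
    semicolon (for example "amp"); those are skipped here because the
    proposal settled on including both & and ;.
    """
    return sorted(
        f"&{name}"
        for name, value in html5.items()
        if value == char and name.endswith(";")
    )


if __name__ == "__main__":
    # Several strings may point at the same character ...
    print(entities_for("|"))
    # ... while each entity string expands to exactly one value.
    print(html.unescape("&vert;"))
```

A lookup like this supports the String datatype argument: the mapping from entity to character is a function, but the reverse mapping can return more than one value.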