Welcome to Wikidata, Vojtěch Dostál!

Wikidata is a free knowledge base that you can edit! It can be read and edited by humans and machines alike and you can go to any item page now and add to this ever-growing database!

Need some help getting started? Here are some pages you can familiarize yourself with:

  • Introduction – An introduction to the project.
  • Wikidata tours – Interactive tutorials to show you how Wikidata works.
  • Community portal – The portal for community members.
  • User options – including the 'Babel' extension, to set your language preferences.
  • Contents – The main help page for editing and using the site.
  • Project chat – Discussions about the project.
  • Tools – A collection of user-developed tools to allow for easier completion of some tasks.

Please remember to sign your messages on talk pages by typing four tildes (~~~~); this will automatically insert your username and the date.

If you have any questions, don't hesitate to ask on Project chat. If you want to try out editing, you can use the sandbox to try. Once again, welcome, and I hope you quickly feel comfortable here, and become an active editor for Wikidata.

Best regards! --Tobias1984 (talk) 22:23, 13 May 2014 (UTC)

Previous discussion was archived at User talk:Vojtěch Dostál/Archive 1 on 2022-06-21.

Gampe (talkcontribs)
The Wikidata Barnstar
Thank you for the quick and effective help with the Česká divadelní encyklopedie template!

--Gampe (talk) 10:07, 18 October 2024 (UTC)


errors originating from your kmol.cz imports

Jamie7687 (talkcontribs)

This data source seems to be riddled with errors (I've caught 5 so far today). I've been deprecating the erroneous claims as I find them, but do you have any other thoughts on what could be done about these? Do we need to import every error on that website, or can you do some additional checking (e.g. if the site says a person died before they were born or was born in the far future), and maybe reach out to them to fix the issues? Thanks,

Vojtěch Dostál (talkcontribs)

Hi, if it's just five or so out of 5,000, that's not so bad. However I'll do a check for these 'obvious' mistakes - I usually do that with OpenRefine but I had to use QuickStatements because OpenRefine has an unpleasant bug.

Vojtěch Dostál (talkcontribs)

Most of the wrong dates I found were already found by you, good job. I fixed just one or two more.
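
The check for "obvious" mistakes discussed in this thread (a death before the birth, or a date in the future) could be sketched roughly as follows. This is only an illustration, not the check that was actually run in OpenRefine or QuickStatements; the function name `date_problems` and its parameters are hypothetical.

```python
from datetime import date
from typing import List, Optional


def date_problems(born: Optional[date], died: Optional[date], today: date) -> List[str]:
    """Return human-readable problems found in one person's dates.

    Hypothetical helper: it only covers the two kinds of 'obvious'
    mistakes mentioned in the thread.
    """
    problems: List[str] = []
    if born is not None and born > today:
        problems.append("born in the future")
    if died is not None and died > today:
        problems.append("died in the future")
    if born is not None and died is not None and died < born:
        problems.append("died before birth")
    return problems


if __name__ == "__main__":
    # Made-up record: death date earlier than birth date.
    print(date_problems(date(1900, 5, 1), date(1850, 1, 1), today=date(2024, 10, 18)))
```

Records for which the list is non-empty could be held back from the import and reported to the source website instead.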

Epìdosis (talkcontribs)

Hi! I'm having a look at cases of described by source (P1343) with qualifier URL (P2699) and I found these items (https://w.wiki/BBwY) with described by source (P1343) Q124625144. I think these could be converted into an external identifier: do you see any possible drawback? Thanks as always!

Vojtěch Dostál (talkcontribs)

Hi! I see no drawback for now, but these websites tend to be terribly short-lived and I can't predict how long it will last.

Epìdosis (talkcontribs)

I see. At least with an external ID we can very easily change the formatter URL to an eventual new domain or, in the unfortunate case of a shutdown, to an archival URL. If you want I can make the property proposal.

Vojtěch Dostál (talkcontribs)

I have no problem with that, thanks.
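
The advantage mentioned above can be illustrated with a small sketch. On Wikidata a formatter URL contains the placeholder `$1`, which is replaced by the identifier value; if the site moves, only the formatter URL changes while the stored identifiers stay the same. The domains and the identifier below are made up for the example, and `build_url` is a hypothetical helper, not part of any Wikidata tool.

```python
def build_url(formatter_url: str, external_id: str) -> str:
    """Expand a formatter URL by substituting the '$1' placeholder."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL must contain the $1 placeholder")
    return formatter_url.replace("$1", external_id)


# Hypothetical formatter URLs: original domain, new domain, archived copy.
ORIGINAL = "https://old.example.org/person/$1"
MOVED = "https://new.example.org/people/$1"
ARCHIVED = "https://archive.example.org/old.example.org/person/$1"

if __name__ == "__main__":
    stored_id = "12345"  # made-up identifier value stored on an item
    for formatter in (ORIGINAL, MOVED, ARCHIVED):
        print(build_url(formatter, stored_id))
```

With P1343 plus a URL qualifier, by contrast, every full URL is stored separately and each one would have to be edited after a domain change.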

Bean49 (talkcontribs)
Vojtěch Dostál (talkcontribs)
Bean49 (talkcontribs)

Thanks. Not until now.

Vojtěch Dostál (talkcontribs)

I fixed all the occurrences manually.

Bean49 (talkcontribs)

Why do you duplicate the value? Without preferred rank, it breaks many uses on Wikipedias. And violates constraint here on Wikidata. I reported it before.

Vojtěch Dostál (talkcontribs)

Please can you provide an example where the same value was duplicated?

Bean49 (talkcontribs)

I provided you. See my first message. First you duplicated it with tool, then manually.

Bean49 (talkcontribs)

Twice with tool, removed one manually.

Vojtěch Dostál (talkcontribs)

Maybe you are mixing up two different 'errors', one of which is described at https://github.com/OpenRefine/OpenRefine/issues/6798 and the other (presence of two seemingly identical dates, but with different calendars) is not an error at all. Or am I not seeing something?

Bean49 (talkcontribs)

Please set preferred rank for one of them, as required by the constraint and many use cases. Thank you.

Vojtěch Dostál (talkcontribs)

That's a nice theory, but how do you propose to do that automatically for thousands of cases where the birth dates differ by a few days? There's no way to determine which of them is correct. Same with calendar dates, we can't really be sure which piece of data is correct.

Bean49 (talkcontribs)

The problem is artificially created by you, not by nature. Now we have two identical dates. Great! Perfect! Thank you for your contributions. Please fix at least this item. Thank you.

Vojtěch Dostál (talkcontribs)

Please advise, was he born on 1st December of Gregorian calendar or Julian calendar? Happy to fix it then.
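
Why two dates that look identical are still different claims can be shown with a short sketch: the same day and month read in the Julian calendar corresponds to a later day in the Gregorian calendar. The code below uses the standard Julian Day Number formula; the year 1850 in the usage example is an assumption made only for illustration, since the thread does not state the year.

```python
from datetime import date

# Offset between a Julian Day Number and Python's proleptic Gregorian ordinal
# (date(2000, 1, 1).toordinal() == 730120, and its JDN is 2451545).
JDN_OFFSET = 1721425


def julian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian-calendar date to the equivalent Gregorian date."""
    return date.fromordinal(julian_calendar_to_jdn(year, month, day) - JDN_OFFSET)


if __name__ == "__main__":
    # Hypothetical year: 1 December 1850 read as a Julian-calendar date.
    print(julian_to_gregorian(1850, 12, 1))  # 1850-12-13
```

So "1 December (Julian)" and "1 December (Gregorian)" are twelve or thirteen days apart in the 19th and 20th centuries, which is why the source alone may not settle which statement is correct.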

Bean49 (talkcontribs)
Vojtěch Dostál (talkcontribs)

I think Czech Wikipedia uses limit=1 to display only one value. But the implementation is different in each wiki...

Bean49 (talkcontribs)

I kindly ask you, pick one and set preferred rank, doesn't matter which one, because you duplicated it arbitrarily. As an alternative, you can delete one of them. Or do you intend to duplicate all dates where the calendar isn't specified in the source?

Vojtěch Dostál (talkcontribs)

What is kind about that? You are pressing me to disrupt Wikidata, seemingly to serve the purposes of some arbitrary Wikipedia version which does not use Wikidata properly.

Bean49 (talkcontribs)

You get me wrong. You created an artificial problem. I kindly ask you to clear it. Otherwise I will have to ask the community. I don't understand why you think that what you did is right.

Vojtěch Dostál (talkcontribs)

Yes, please go and ask the community if it is really OK to arbitrarily choose one value as preferred. Let's find a systematic solution, not an ad hoc decision on one item.

Summary by Vojtěch Dostál

seems ok

Bernice Heiderman (talkcontribs)

Hi Vojtěch,

all links to aleph.nkp.cz have been offline for a few days. Do you know more about this problem?

JAn Dudík (talkcontribs)
ŠJů (talkcontribs)

@Bernice Heiderman, JAn Dudík: I would hope that planned maintenance would be properly announced. But it is possible that the failure to react immediately to the outage was related to the holidays. In some institutions, I suspect, an overzealous worker shuts down too many computers when leaving the workplace.

Helena Dvořáková (talkcontribs)

Hi Vojta,

just for fun I ran your saved query https://w.wiki/3MsN for strangely duplicated birth dates. From what I have looked at, it seems that some of them arose from two REGO records being glued together. I have already resolved some by creating a new item (e.g. Jan Drábek) and I will continue with that. It would be better, though, to find some way to prevent the improper linking.

Vojtěch Dostál (talkcontribs)

Hi Helena, I have seen your message here but have not had time to reply yet, sorry.

The gluing together of two items is a rare bug in OpenRefine; I have been watching it for some time and have not managed to understand its exact causes.

There hopefully shouldn't be many of them. Or did you find more such joined REGO records? I will try to write a query for it, but it is complicated: some REGO records are duplicates and so really do belong in one item.

Helena Dvořáková (talkcontribs)

So I examined some of them and split them. But for the new items I am not creating everything, only the most important parts. Hopefully you or someone else occasionally runs the filling-in of given names and family names from labels where they are still missing...

MIGORMCZ (talkcontribs)

Hi, I create items for weirs now and then, and only now did I notice that you imported them from the Centrální evidence vodních toků database. However, your import did not fill in the administrative unit, so the items you imported do not show up on some maps and duplicates are created needlessly (I use the Wikidata Query Service with a filter by district, so I had not noticed your import until today). Also, the weir names in your import sometimes do not match the current state of the watercourse register (the item you created, Q123547704 jez Choceň II, is listed in the source database, according to the map, as Choceň I: see https://voda.gov.cz/?page=jezy-mapa, and conversely Choceň I as Choceň II). Could these two errors (administrative units and names) be fixed by machine somehow?

Vojtěch Dostál (talkcontribs)

Regarding the weirs in Choceň: in Mapy.cz it is the other way round, and the content of Category:Jez_Choceň_II matches that too. I probably worked from that. Which is correct? What matters is that the objects are matched to each other (based on the river kilometre); the name is secondary...

I will add the administrative units by bot, where possible, based on coordinates; they are not in the source database.

MIGORMCZ (talkcontribs)

Thanks for adding those units. As for the names, I noticed discrepancies at Brandýs nad Orlicí too, and there is no commonscat there. The only thing I can think of is that the database may have changed since the import.

Wrong aliases and official names of Czech remarkable trees

PaperHuman (talkcontribs)

Often, the names in ÚSOP are not official names but just the taxon names. For example, https://www.wikidata.org/w/index.php?title=Q11775258&oldid=1901567939 contains "Lípa malolistá" as the official name, which is only the taxon. "Lípa malolistá" also shouldn't be an alias there.

There are ~100 items with "Lípa malolistá" as the official name, ~50 more with "Lípa velkolistá" and others with different taxon names.

Vojtěch Dostál (talkcontribs)

Yes, unfortunately the taxon name is sometimes (often) the official name. There is not much we can do.


Sometimes there's a better (descriptive) name in the official database and we try to use it, but if I remember correctly it's not easy to find. I cannot find a different name for the example you have given in the official databases.

Item about trees

Summary by Vojtěch Dostál

Merged Q811534 with Q10065268. The remaining issue is whether to use P1435 for trees; in case we find consensus on that, let me know.

VIGNERON (talkcontribs)

Hi,

I see that you imported a lot of trees from the Czech Republic. That's great work, thanks a lot! We might take inspiration from it for importing trees from France.

That said, I have a couple of questions: why use remarkable tree (Q811534) in instance of (P31), rather than tree (Q10884) in instance of (P31) plus award received (P166) or heritage designation (P1435), as we do for protected buildings? Some data are a bit strange and contradictory: Q26779918 is not a tree but a group of trees, so why not just leave grove (Q1510380)? Also, the quantities in has part(s) of the class (P2670) are strange: are there 2, 3 or 8 trees? (Looking at the source, it seems to be 3 trees with only 2 of them protected, but it's unclear and my Czech is not good.)

I'd love to hear what you think.

Cheers,

ŠJů (talkcontribs)

The DRÚSOP register has two columns: "poč vyhl." (počet vyhlášený, number of originally declared) and "poč. souč." (počet současný, current count). The importer did not use any qualifiers to distinguish the two numbers.

Vojtěch Dostál (talkcontribs)

Hi, it's been some time since the import happened and it's true that I would change some modelling nowadays.

At first glance, having P31: tree (Q10884) or group of trees (Q1510380) sounds like a good idea, but I am not sure where to put památný strom (Q811534). A protected tree is something like a nature reserve (a category of protected area) in the Czech Republic, and we tend to use P31 for these protected area designations (see Prachovské skály (Q452242) for example). Therefore, we understand památný strom (Q811534) as a type of protection designation similar to national monument or national reserve, no matter how many trees are included. The strange data in has part(s) of the class (P2670) are a mistake by LinkedPipes ETL Bot and I'll try to look into it when time allows...

VIGNERON (talkcontribs)

Thanks for the quick answer.

As I said, I would put remarkable tree (Q811534) (or a more specific subclass?) in award received (P166) or heritage designation (P1435). For me, a label or a protection is the same, whether it's a Nobel prize, a protected building or a protected tree.

There is no hurry, we can take time to think about it. For more points of view, I'm also pinging Nikola Tulechki who worked on trees in Bulgaria, Nemo bis for Italy, Lodewicus de Honsvels in Germany and Pere prlpz in Catalonia.

Pere prlpz (talkcontribs)

Hello.

In my view, remarkable tree (Q811534) means any notable tree, that is, any tree that is covered individually by reputable sources. Some of them are included in official natural heritage catalogues or have some kind of legal protection, like tree of local interest (Q115867635). However, remarkable tree (Q811534) is a value of instance of (P31), but tree of local interest (Q115867635) is a value of heritage designation (P1435), as we do for buildings.

It would be possible to use tree (Q10884) as instance of (P31) instead of using remarkable tree (Q811534) and it would be fairly reasonable, but I see a couple of problems with that:

Of course, there is an inconsistency in Wikidata between how we treat trees, buildings and people, especially in instance of (P31). For buildings we take a quite concrete instance of (P31) (like church or cathedral); for people we stick to human, and all individual characteristics go to other properties; and for living beings we take the middle ground of individual animal (Q26401003) and remarkable tree (Q811534). I suppose we could take a different and unified approach and try to reduce the number of values of instance of (P31) (or expand them) across Wikidata, but that would go far beyond trees.

Where I'm usually doubtful is about what to do with small sets of trees, but also small sets of anything else (two buildings, two people, two hills...). To make things more complex, as far as I know, such sets of a few trees are usually protected in Barcelona as tree of local interest (Q115867635) and not as the equivalent protection for groves ("arbreda d'interès local", still not present in Wikidata). Therefore, I tend to use for them the same properties as for a single tree, which doesn't feel like a very satisfactory solution, although I think I've encountered only a few such cases.

Nemo bis (talkcontribs)

I've not looked into the import and I don't have a specific opinion to add. Where there is some doubt, I prefer a statement to be repeated in multiple properties: if Q811534 is stated both in P31 and P1435, then it will be easier for people to find what they need with an individual query even if they're not aware of the more specific classes or properties. What matters is only that it's possible for those who care to narrow down the results to more specific definitions (e.g. designations which use a specific official source as reference).

(Unrelatedly, P1435 has a horrible label in French and Italian, as "patrimonio" sounds like everything needs to be treated for its property/capital/money value. I despise it.)

Pere prlpz (talkcontribs)

A few informative queries about instance of (P31):

There are some thousands of remarkable tree (Q811534) https://w.wiki/7gET but only a couple of tree (Q10884) https://w.wiki/7gEW

By looking at the map of all items with coordinates and individual of taxon (P10241) https://w.wiki/7gEb I would say that:

  • Somebody in Portugal, Estonia or some Austrian land may be interested in this discussion. I can't check now and notify.
  • A lot of legal statuses are used as instance of (P31). That's different from what we do with buildings, AFAIK, where we put the status in heritage designation (P1435).
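
A count like the ones quoted above can be obtained from the Wikidata Query Service with a very small SPARQL query. The sketch below only builds the query text; it is an assumption about what such a query might look like, not a reproduction of the queries behind the short links, and `count_instances_query` is a hypothetical helper name.

```python
def count_instances_query(class_qid: str) -> str:
    """Build a SPARQL query counting items whose P31 is the given class."""
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    return (
        "SELECT (COUNT(?item) AS ?count) WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        "}"
    )


if __name__ == "__main__":
    # Q811534 (remarkable tree) and Q10884 (tree) are the two classes compared above.
    for qid in ("Q811534", "Q10884"):
        print(count_instances_query(qid))
        print()
```

The printed text can be pasted into the query service; the counts themselves will of course have changed since the discussion.
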
Vojtěch Dostál (talkcontribs)
ŠJů (talkcontribs)

There is some inconsistency in labels (and aliases) of "Q811534". Some of them mean a specific type of protection (regardless of the number), some of them a general significance of any type, and in some languages ("es" and surroundings) just any "single tree".

VIGNERON (talkcontribs)

My current reasoning is as follows:

What do you think?

Cheers,

Pere prlpz (talkcontribs)

The wordings of the labels of remarkable tree (Q811534) are quite different, but the ones I can understand convey a similar meaning: "tree of interest", "tree of heritage value", "tree of cultural or natural significance", "notable tree"... Am I missing labels that mean a specific type of protection or that imply legal protection?

Aliases are more varied and sometimes have disparate meanings for the same language (for example, for Romanian I'd say they range from individual tree to protected tree). I take this just as a consequence of not having items for protected tree or monument tree and using a single item for the instances of all individual trees.

About the inconsistency between meanings "individual tree" and "notable tree":

  • By now, I would say that they are quite equivalent in Wikidata. If a tree has an item, it follows the rules in Wikidata:Notability and this means that it has been described by a reliable source. Therefore, all individual trees present in Wikidata are notable trees, just as all individual animal (Q26401003) are notable animals (Talk:Q26401003#Label is an interesting short debate about the same question for animals).
  • The notability threshold in Wikidata is pretty low. After seeing that somebody uploaded to Wikidata all streets of Brussels or Toulouse, all hotels in Barcelona or all houses in some neighbourhoods of Prague, among other sets of non-famous things, I wonder if somebody else will eventually create items for all individual trees in the streets of Paris or Sydney. If that happens we could need different items for "notable tree" and "individual tree", although at the moment I can't see that coming.
Pere prlpz (talkcontribs)

My previous answer was written at the same time as Vigneron's. This is an addition after reading his one.

I don't oppose creating different items for "famous/notable tree" and "individual tree", although I find it difficult to tell one apart from the other. The only criterion I can think of is that "notable trees" have a proper name or legal protection as an individual tree or small group, and I'm not sure if this criterion is consistent even in my city.

VIGNERON (talkcontribs)

For me, if you get rid of the notion of remarkable tree, which means everything and nothing, the difference seems easy and obvious: all trees are individual trees, and only the few with a specific protection or award are protected/awarded trees. Hence, we use P31 = tree (and just that) for all of them, and for the protected or awarded ones we complete it with P166 or P1435.
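
The two modelling options debated in this thread can be put side by side as plain data. This is only a sketch of the statements each option would produce; the helper names are hypothetical, and Q21296252 (the Czech protected-tree designation mentioned elsewhere in this discussion) is used purely as an example value.

```python
from typing import List, Optional, Tuple

Statement = Tuple[str, str]  # (property ID, value item ID)

INSTANCE_OF = "P31"
HERITAGE_DESIGNATION = "P1435"
TREE = "Q10884"
REMARKABLE_TREE = "Q811534"


def class_based() -> List[Statement]:
    """Option 1: the special status is expressed through the P31 class."""
    return [(INSTANCE_OF, REMARKABLE_TREE)]


def property_based(designation: Optional[str] = None) -> List[Statement]:
    """Option 2: every tree is P31 = tree; any status goes into P1435."""
    statements: List[Statement] = [(INSTANCE_OF, TREE)]
    if designation is not None:
        statements.append((HERITAGE_DESIGNATION, designation))
    return statements


if __name__ == "__main__":
    print(class_based())
    print(property_based())             # an ordinary tree
    print(property_based("Q21296252"))  # a legally protected tree
```

Under option 2 a query for protected trees filters on the second statement, while under option 1 it filters on the P31 class or its subclasses.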

Pere prlpz (talkcontribs)

You have a point that remarkable tree means everything and nothing.

My biggest doubt about using P31 = tree for all trees is what happens if at some point Wikidata is flooded with trees from an exhaustive register of trees somewhere, because we would need some way to tell apart the notable ones (the ones covered individually by some reliable source) from all trees. That situation seems unlikely for trees in the short term, but something similar happened in France with sports venue (Q1076486), and since even the smallest private sports center has a P31 of sports venue, it would be very hard to make a list of notable sports venues in France (libraries in Spain are in a similar situation).

Using legal protection and awards may be useful, but there are notable trees (covered by reputable sources) that don't have legal protection. For example, https://patrimonicultural.diba.cat/element/roure-sam or the trees marked (with a proper name) in Mapa Topogràfic de Catalunya (Q63431924), both of which are official reputable sources but aren't legal protections nor heritage classifications.

Maybe I'm overthinking this and I'm preparing for a too unlikely risk.

VIGNERON (talkcontribs)

I hear your concerns (and yes, sport venues/facilities are a mess in France, with a lot of duplicates) and you're right, it may happen with trees *but* there is still WD:N to solve that, and I don't think that "instance of tree" instead of "instance of remarkable tree" will really impact this.

Pere prlpz (talkcontribs)
Vojtěch Dostál (talkcontribs)

Both items have sitelinks to Czech Wikipedia so we can use those articles as hints. památný strom (Q811534) is for trees protected by state, while významný strom (Q10065268) is for just about any remarkable tree. This distinction was introduced to Czech Wikipedia by @Xth-Floor and he might be interested in this discussion. I am afraid that the other sitelinks in those two items do not correspond to 'our' definition and it may need some reshuffling, but let's see.

Pere prlpz (talkcontribs)

The sitelinks of remarkable tree (Q811534) seem to be mixing both meanings, sometimes in the same article. You have a point that we could use an item for "tree" and another for "protected tree", although that's quite different from what we do for buildings.

Pere prlpz (talkcontribs)
Pere prlpz (talkcontribs)

I notify @Gikü and @GoEThe in case they may be interested in this discussion.

GoEThe (talkcontribs)

Thank you for pinging me. I will certainly follow this discussion and I can try to apply the consensus to the Árvore de Interesse Público (Q52062847) instances I imported, but I do not have strong feelings about what the "proper" way is. I would certainly like them to be in better alignment with other protected trees in Wikidata, to make them more findable, so any tips in this regard are welcome.

Quelet (talkcontribs)

I am currently working on importing all trees in my hometown into OpenStreetMap. That makes some sense, because it makes it possible to detect fallen, sick or missing trees for my local community. But just a few trees are notable, i.e., have a name. In my opinion, only those having a name or being notable in some sense should have the right to be in WD. A similar analysis for streets shows a key difference: streets have a name, and importing them into WD may allow analyses of names, length, etc., even though you might do that as well from OSM data if OSM streets were well labeled with proper keys.

Vojtěch Dostál (talkcontribs)

Hello @VIGNERON @Pere prlpz @Nemo bis Can we try to wrap up this discussion and identify the key action points, before this discussion is archived?

VIGNERON (talkcontribs)

I agree but I'm not sure what conclusion can be drawn (and since I started the discussion, it's maybe better if someone else close it).

Nemo bis (talkcontribs)

I've not re-read everything but I can't identify any action points here except that it would be nice to document how some of these properties and classes have been used so far. Is there an appropriate project page?

Pere prlpz (talkcontribs)

I can try to make a summary, but I'm afraid it will be a summary about how we disagree, because we didn't agree on much despite the very interesting talk.

In light of our disagreements, any global action we could take or any global recommendation will either leave a lot of redundancy or go against the opinions and practices of some participants, and therefore I can't see a good conclusion that more or less pleases everyone:

VIGNERON (talkcontribs)
Vojtěch Dostál (talkcontribs)

Thanks @Pere prlpz! After reading the conversation again, I think I will merge významný strom (Q10065268) into památný strom (Q811534) and make it clear in the Czech label and description that památný strom (Q811534) is not *only* about trees protected by law, as they now suggest. The official item for Czech law-protected trees will then be památný strom v Česku (Q21296252). We can keep památný strom (Q811534) in the instances, while památný strom v Česku (Q21296252) should go to heritage designation (P1435) if we agree to use this property.

@Adam Hauner @Xth-Floor FYI

Adam Hauner (talkcontribs)

@Vojtěch Dostál, thank you for letting me know about this. I'm not sure if heritage designation (P1435) is appropriate: the protection of "památný strom v Česku (Q21296252)" is primarily protection of a part of nature or the natural environment; only some of these protected trees are also protected for cultural heritage or historical significance. Could you find a better-suited property from the area of nature protection?

Olea (talkcontribs)
VIGNERON (talkcontribs)

Olea, I'm still unsure about the use of heritage designation (P1435)... I think that first we should really start a broader discussion here on Wikidata to get more points of view (unrelatedly, we can talk about it IRL this weekend) and then indeed maybe propose creating a new property for "natural designation".

Olea (talkcontribs)

@VIGNERON it will be great to meet you in person :-)

Pere prlpz (talkcontribs)

And I think the idea of not using heritage designation (P1435) for protected areas doesn't translate well to not using it for trees. An individual tree is not a protected area (nor an area). It's an individual item like a building or a sculpture. Interestingly, there are values of heritage designation (P1435) like art públic de Barcelona (Q15945449) that apply to sculptures and to some trees.

Additionally, I'm not sure about what you propose. In your link you propose a new property "protection status of a natural area", but as far as I know it has not been adopted, and therefore it couldn't be used even if it were suitable for trees.

Nowadays, the alternatives for stating the status of a tree are using instance of (P31) and using heritage designation (P1435), unless I'm missing some alternative. In other places you have argued for a flatter ontology, which has some merit, and using heritage designation (P1435) provides a flatter ontology and less granularity in instance of (P31).

Olea (talkcontribs)

@Pere prlpz With the proposed data model, a tree (say QXXXXX) would be P31 as tree (or a subclass maybe). If the tree is protected, there should be a related designation (say QYYYYY). Then we only need to state QXXXXX located in protected area (P3018) QYYYYY. Check page 14.

> using heritage designation (P1435) provides a flatter ontology and less granularity in instance of (P31).

The proposal flattens the ontology thanks to a new property and a consistent data model. Check page 17. This would also fix practical data reuse problems like telling UNESCO's World Heritage cultural sites apart from natural sites.

I'm not saying this is THE proposal, but it has a lot of previous thought.


Notification about an error of your bot

Adam78 (talkcontribs)
Vojtěch Dostál (talkcontribs)
2A02:8428:B02B:2001:3050:DC14:8A88:1D10 (talkcontribs)

Hello, Mr Vojtěch Dostál, you made a mistake: I speak Romanian, not Romani, which is a different language. Could you please rectify it? Many thanks and have a nice day! Radu Alexandru Negrescu-Suţu

Vojtěch Dostál (talkcontribs)

Thank you, I deprecated the information in Wikidata (marked it as false) and informed the people who run the source database.

Vojtěch Dostál (talkcontribs)

The mistake is already fixed in the source database.