Deprecating values

Please do not make changes like this, deprecating more precise data in favour of less precise data. This is especially bad when it turns out the data from EPE's source is simply Gaia data but truncated. Huntster (t @ c) 21:06, 18 April 2023 (UTC)

@Huntster thanks for spotting this. What I'm trying to build is an automated synchronization mechanism between various astrodata sources and Wikidata. As part of that, we need a heuristic that tries to decide which single measurement should be featured by rank.
First of all, there has to be a way to manually override this mechanism:
  1. You can set preferred rank on a certain statement, and the heuristic stops assigning any ranks for this combination of item/property
  2. You can deprecate a specific statement with reason for deprecated rank (P2241), and the heuristic stops changing ranks for this specific statement (see Q113449733#P2051, where they clearly messed up the measurement units)
So if you want to interfere and stop automated ranking, feel free to do it. Human judgement is always better than a heuristic.
Regarding your proposal: precision might work sometimes (and I'm planning to use it for time of discovery or invention (P575) as we speak), but I don't think using it for all measurements is a good idea. I can see a lot of absurdly high precision values in all databases, because their maintainers did not bother to round their calculations correctly (for instance Q597170#P2051). From my experience, a much better heuristic is aiming for the latest publication date of the sources. Besides the fact that it works most of the time, it motivates the following actions:
  • specifying real sources for the statements (not stated in (P248): Extrasolar Planets Encyclopaedia (Q1385430))
  • specifying accurate publication dates for sources
  • (if nothing else works, e.g. Keppler et al. was published after Gaia DR2) finding the most recent measurements like this
What do you think? Ghuron (talk) 05:05, 19 April 2023 (UTC)
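The two manual overrides and the "latest publication date" rule described above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the `Statement` class, its field names and the `pick_featured` function are hypothetical, and the choice to ignore statements whose source has no publication date is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass
class Statement:
    """Hypothetical, simplified view of one Wikidata statement."""
    value: str
    rank: str = "normal"                      # "preferred", "normal" or "deprecated"
    deprecation_reason: Optional[str] = None  # value of P2241, if any
    published: Optional[date] = None          # publication date of the cited source


def pick_featured(statements: Sequence[Statement]) -> Optional[Statement]:
    """Return the statement the heuristic would feature, or None to leave ranks alone."""
    # Override 1: a human marked something as preferred, so this
    # item/property combination is left untouched.
    if any(s.rank == "preferred" for s in statements):
        return None
    # Override 2: statements deprecated with an explicit reason are never re-ranked.
    candidates = [
        s for s in statements
        if not (s.rank == "deprecated" and s.deprecation_reason)
        and s.published is not None  # assumption: undated sources cannot compete
    ]
    # Heuristic: the most recently published source wins.
    return max(candidates, key=lambda s: s.published, default=None)
```

A `None` result means "do nothing", which covers both the human override and the case where no candidate has a dated source.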
Certainly, there has to be a reasonable degree of precision considered. Common sense must play a part as well. In the example I gave for PDS 70, it was "113.4314±0.5211" vs. "113.43±0.52". It's obvious what has happened here, backed by Birnsteil et al citing Gaia 2016,2018 (i.e. Gaia DR2) as their source. Using the latest publication is often a useful metric, but using smallest uncertainty + logic is often superior. Huntster (t @ c) 06:49, 19 April 2023 (UTC)
Well, when you are writing code, it is a bit difficult to formalise "reasonable degree of precision" and "common sense".
Help me understand your idea. For PDS 70 I was choosing between the following candidates for normal rank:
What you are saying is that because the second measurement is basically a rounding of the first one (for amount, lower and upper limit), I should disqualify the second measurement before I start looking for the most recent reference publication date? Ghuron (talk) 07:14, 19 April 2023 (UTC)
@Huntster I have a couple of questions regarding some of your edits. You seem to be the only wikidata editor interested in exoplanet measurements, and I want to make sure my automatic edits are consistent with your understanding. Would you mind starting a chat on the messenger/irc of your choice? Ghuron (talk) 10:16, 19 April 2023 (UTC)
Hah, it was very late and I just looked at the first author in the WD item for Discovery..., lol. And I do understand there are limitations when coding. Regarding the precision question, when I saw the two numbers I knew almost immediately what had happened because it's just too convenient that the newer one was exactly the same minus the two decimal places, so I checked their sources. Coding for such a check might not be easy, I'm not sure, since while it might be the same numbers, it could also be one number off (due to rounding) at an unknown number of decimal places.
I'd be happy to have a chat, it's just a matter of when and where. I just got to work at the time of this writing, so it couldn't be 'soon', and I have a limited social media presence. Do you have IRC or Discord? Huntster (t @ c) 13:20, 19 April 2023 (UTC)
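One possible way to code the "is this just a rounded copy?" check discussed above is to compare the two values at the precision of the shorter one and allow a difference of one unit in its last decimal place, which also covers truncation. This is a minimal sketch under those assumptions; the function name `is_rounded_copy` and the `ulp_tolerance` parameter are made up for illustration.

```python
from decimal import Decimal


def is_rounded_copy(coarse: str, fine: str, ulp_tolerance: int = 1) -> bool:
    """True if `coarse` looks like `fine` with trailing decimal places dropped.

    Values are passed as strings so the number of written decimal places is kept.
    """
    c, f = Decimal(coarse), Decimal(fine)
    c_exp = c.as_tuple().exponent  # e.g. -2 for "113.43"
    f_exp = f.as_tuple().exponent  # e.g. -4 for "113.4314"
    if c_exp <= f_exp:
        return False  # `coarse` is not actually less precise than `fine`
    step = Decimal(1).scaleb(c_exp)  # one unit in the last place of `coarse`
    # Round `fine` to the precision of `coarse`, then allow being off by one unit
    # (covers truncation instead of rounding).
    return abs(f.quantize(step) - c) <= ulp_tolerance * step
```

For the PDS 70 example, `is_rounded_copy("113.43", "113.4314")` and `is_rounded_copy("0.52", "0.5211")` both hold, so the second measurement could be disqualified before publication dates are compared. The same check would have to pass for the amount and for both limits.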

Deprecated ranks for true statements


You shouldn't set instance of (P31) statements to deprecated rank (Q73737357) when they are indeed true and confirmed. For example, on Hamal I think these three should have been kept as "normal".

Thanks, Romuald 2 (talk) 16:43, 2 May 2023 (UTC)

I erroneously thought that there should be only one correct instance of (P31). I can see now that there are many cases when more than one is desirable. Sorry about that, I'm planning to run a new batch to fix it. Ghuron (talk) 16:51, 2 May 2023 (UTC)
Thanks a lot and thank you for your hard work there ;) Romuald 2 (talk) 18:54, 2 May 2023 (UTC)

Categorization of visual arts

Good day. Could you please explain why, in 2019 (or earlier), ruwiki editors decided that ru: Категория:Изобразительное искусство <страны> = en: Category:Art in <country>? If online translators and Wikidata editors are not mistaken, then ru: Изобразительное искусство = en: Visual arts. Commons has a whole branch of categories, Category:Visual arts of <country>, that is not linked to Wikidata. --Ыфь77 (talk) 14:24, 23 July 2023 (UTC)

2) A second question on the same topic right away: why are the de: Kategorie:Bildende Kunst (<Land>) items not connected to any individual ru: Категория:Изобразительное искусство <страны>? I started connecting them, but then had doubts: although ru: Изобразительное искусство = de: Bildende Kunst, the German Bildende Kunst includes architecture (as does the English visual arts), while the Russian ИЗО does not. --Ыфь77 (talk) 15:41, 23 July 2023 (UTC)
I'm flattered that you consider me a specialist in this area, but alas, I am not. Ghuron (talk) 06:05, 24 July 2023 (UTC)
In 2019 you did something with these categories in automated mode, so when a familiar username caught my eye in the edit history, I asked these questions. Still, I would like to hear the personal opinion of a more experienced contributor: is it worth disconnecting ru: Категория:Изобразительное искусство <страны> and en: Category:Art in <country>, and also connecting ru: Категория:Изобразительное искусство <страны> and de: Kategorie:Bildende Kunst (<Land>)? 3) Or at least tell me which Russian-speaking Wikidata contributor you consider a specialist on this question (because undoing item connections takes longer than making them). --Ыфь77 (talk) 09:25, 24 July 2023 (UTC)
I am a person with a technical education (not of the highest quality, and not exactly fresh either, to be honest), so I did indeed tinker with categories, but 99% of the time with ones whose inclusion criterion is formulated with utmost clarity, such as Q60869679. The Wikidata community, as you may have noticed, by and large does not care about sitelinks. They are essentially a kind of curtsy towards the other Wikimedia projects as the main clients.
The correspondence between ruwiki and dewiki categories can bother only a vanishingly small number of category nerds in both projects, so in my opinion, be bold (which, in fact, you are already doing). I would recommend enabling the Move gadget so that you can move sitelinks between items in batches. Ghuron (talk) 19:17, 24 July 2023 (UTC)
Thanks for the reply. 1) As far as I can tell from the edit history, the confusion ru: Категория:Изобразительное искусство <страны> = en: Category:Art in <country> is older than Wikidata itself! I am untangling it wherever I come across it. 2) The gadget is enabled and in use; it is a pity that it does not move labels along with the sitelinks. 3) "The Wikidata community, as you may have noticed, by and large does not care about sitelinks": and that is a great pity. A revert from this morning: our скотоводство is connected to the English pastoral farming rather than to cattle husbandry, which it is closer to. But I link the child items according to the Russian connections. Ыфь77 (talk) 19:44, 24 July 2023 (UTC)

Small issue

Hello, it looks like your bot keeps reverting itself in an infinite loop on this item: Kepler-91b (Q15460475). There might be other items affected. Romuald 2 (talk) 13:59, 5 September 2023 (UTC)

Hi again! So lately I've been editing items about exoplanets and I've noticed there are quite a lot of duplicate items, ranging from having the exact same title to using a different designation (sometimes it's just the database being weird). E.g. BD-21 397 b (Q120292596), BD-21 397 b (Q121716495), BD-21 397 b (Q120727941) and BD-21 397 b (Q119951732) were all about the same exoplanet! It would be interesting for the bot to check for the existence of duplicates. Thanks, Romuald 2 (talk) 20:18, 17 September 2023 (UTC)
It does check, but it relies on mw:Extension:CirrusSearch, which doesn't seem to be reliable.
I'm planning to rewrite this piece of code this week and get rid of duplicates.
Glad that someone is interested in what I'm doing :)) Ghuron (talk) 05:57, 18 September 2023 (UTC)
Thank you for checking this out, yeah the extension probably needs some improvement in this case. The exoplanets ARE interesting indeed :) On the French Wikipedia, there have been some debates recently about how we could improve the usage and updating of the lists of exoplanets, and we've been considering using Wikidata to help us. Romuald 2 (talk) 13:15, 18 September 2023 (UTC)
Sorry, I didn't really go into the details of what you said. What you're talking about here is a slightly more complex scenario. There are several different scripts running under my name that load data from different sources. What you seem to be asking for is the ability to automatically decide that Extrasolar Planets Encyclopaedia exoplanet ID (P5653): BD-21 397 b is the same as NASA Exoplanet Archive ID (P5667): BD-21 397 b. I'm not sure this would be a good idea. Sometimes exoplanets are "discovered", "retired" and "rediscovered" again under the same name, and a consistent naming scheme is not followed even within the same source.
I would prefer that these loaders work independently of each other and create all necessary items, and that someone with a human brain merges them here. I understand that this adds some burden on other users, and I promise to do this quarterly myself (I'm doing it as we speak).
But there is another point: not all exoplanet items correspond to confirmed exoplanets. For some of them, one source said "confirmed" and another said "unconfirmed." Real "true" exoplanets are associated with their parent star via child astronomical body (P398): exoplanet item, with the qualifier series ordinal (P1545): letter. I would recommend that your colleagues at fr-wiki use only these exoplanets. In ru-wiki we do this (see #P1545 and ru:Список экзопланет в созвездии Большой Медведицы as an example). Ghuron (talk) 07:17, 19 September 2023 (UTC)
I mean, if two exoplanets have more or less the same identifier, and have more or less the same date of discovery (let's say around the same year, because one database would say "May 2023", but the other "2023"), then we can reasonably assume they are the same object. Otherwise we would need to create new items for every single entry for every single database, which goes against the very principle of Wikidata imho.
We can always put the nuances of "unconfirmed" vs "confirmed" vs "controversial" in the items. I've noticed indeed that Simbad is much more careful about the "confirmed" status of an exoplanet (and the updates are probably slower, too) than the EPE for example which is sometimes too generous. And sometimes some previously "confirmed" exoplanets become controversial or retired, but that's how science goes I guess. I wonder what the best way of dealing with different statuses would be. Romuald 2 (talk) 13:35, 19 September 2023 (UTC)
The problem is with "more or less", because I am spending too much time "unmerging" wrongly merged items. But yesterday I realized there are 2 simple cases that can be handled automatically with a reasonably low error rate:
  1. If Extrasolar Planets Encyclopaedia exoplanet ID (P5653) with a specific id does not exist, but NASA Exoplanet Archive ID (P5667) with that id does (or vice versa), the robot can assume that they are the same thing
  2. Sometimes NASA Exoplanet Archive ID (P5667) gets updated (when transitioning from "candidate" to "confirmed" status) and there are "confirmed names" tables at NASA Exoplanet Archive; the robot should attempt to use them
This will cover the vast majority of cases. I guess I didn't realize how MANY new items were being created each run :) Ghuron (talk) 05:03, 20 September 2023 (UTC)
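The two cases above could be sketched like this. It is only an illustration under assumptions: `index` is a hypothetical lookup from (property, identifier) to an existing QID, and `earlier_names` is a hypothetical mapping built from the "confirmed names" tables (confirmed name to earlier candidate designations). The real loaders may be organised quite differently.

```python
from typing import Mapping, Optional, Sequence, Tuple

EPE_ID = "P5653"   # Extrasolar Planets Encyclopaedia exoplanet ID
NASA_ID = "P5667"  # NASA Exoplanet Archive ID


def resolve_item(
    prop: str,
    ident: str,
    index: Mapping[Tuple[str, str], str],
    earlier_names: Optional[Mapping[str, Sequence[str]]] = None,
) -> Optional[str]:
    """Return the QID of an existing item to reuse, or None if a new item is needed."""
    # An exact match on the loader's own property always wins.
    if (prop, ident) in index:
        return index[(prop, ident)]
    # Case 1: the same identifier string exists under the other catalogue's property.
    other = NASA_ID if prop == EPE_ID else EPE_ID
    if (other, ident) in index:
        return index[(other, ident)]
    # Case 2: a NASA identifier was renamed on confirmation; try its earlier names.
    if prop == NASA_ID and earlier_names:
        for old in earlier_names.get(ident, ()):
            if (NASA_ID, old) in index:
                return index[(NASA_ID, old)]
    return None
```

Anything that does not match one of these strict rules still gets a new item, leaving the "more or less the same" cases to a human.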
Thanks for the update and for the corrections. Do you have some examples of items wrongly merged, so I can have an idea of what *not* to merge? Romuald 2 (talk) 21:42, 20 September 2023 (UTC)
This one should work: [1]
It is definitely the same system, "01" normally becomes "b" when confirmed, indicates that there is only one planet in the system, etc. I have done this many times, and 99% of them are correct, but this one is not:
This one has an orbital period of 142 days, and this one only 13 days, which puts it ~5x closer to the host star.
It is not exactly a "merge" (but there were merges like that, I'm just too lazy to find them): data from 2 different exoplanets was loaded into one item, so I had to "unscrew" them into KOI-3936.02 (Q75146435) and KOI-3936.01 (Q113594025) Ghuron (talk) 05:08, 21 September 2023 (UTC)
Woah, this is very confusing! Yes, that was a very good example. Romuald 2 (talk) 12:55, 21 September 2023 (UTC)
So here is a sample in which some NASA Exoplanet Archive ID (P5667) planets were "connected" to existing items, while others were still created as new items. Let me know if something needs to be improved here. Ghuron (talk) 09:44, 21 September 2023 (UTC)
All of these modifications look correct, and I don't see anything obvious that needs to be improved here. I've noticed that Simbad has an entry for GJ 1151 c (Q122746175) (but it is described as a candidate) so I've added it, but strangely enough it is not present on the EPE yet. Also TOI-332 b (Q121913979) might be present on Simbad as TOI-332.01, but not under the name "TOI-332 b" (yet), and I cannot link the two with the discovery papers, this is a case of "maybe it is the same object, maybe not, so let's not link these for now"... Romuald 2 (talk) 12:55, 21 September 2023 (UTC)

P1545

Greetings, colleague!

1. I sometimes run into cases where child astronomical body (P398) lacks the series ordinal (P1545) qualifier, because of which the corresponding planetary system is not displayed in the "Planetary systems" section of the "Звёзды созвездия ххх" template. Could a bot check for this and fix it?
2. When you add new planetary systems to Wikidata, could you send me the list? (or a log; if you do it, for the first point as well) M. Dick (talk) 00:07, 11 September 2023 (UTC)
  1. The catch is that exoplanets are not always confirmed (or earlier papers have been called into question). In these cases I do not put a letter in the series ordinal, and they are not shown in the template or in the lists. Doing this with a bot is difficult, because for now there is a bit of a mess in instance of (P31)
  2. I will write the queries, just a bit later; things are a disaster at work right now.
Ghuron (talk) 04:49, 11 September 2023 (UTC)
Understood! 1. I meant, of course, the cases where the status is "confirmed". That is, P398 is present, the status is confirmed, but P1545 is missing. M. Dick (talk) 08:55, 11 September 2023 (UTC)
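The check requested here (P398 present, planet confirmed, P1545 qualifier missing) could be sketched over an entity's JSON as returned by the Wikidata API. This is a rough illustration only: the function name is made up, and `is_confirmed` is a hypothetical callback, since how "confirmed" should be derived from instance of (P31) is exactly the open problem mentioned above.

```python
from typing import Callable, List

CHILD_BODY = "P398"       # child astronomical body
SERIES_ORDINAL = "P1545"  # series ordinal (used as a qualifier)


def children_missing_ordinal(entity: dict, is_confirmed: Callable[[str], bool]) -> List[str]:
    """List QIDs of confirmed child bodies whose P398 statement has no P1545 qualifier."""
    missing = []
    for statement in entity.get("claims", {}).get(CHILD_BODY, []):
        if statement.get("rank") == "deprecated":
            continue  # deprecated statements are not expected to carry a letter
        if SERIES_ORDINAL in statement.get("qualifiers", {}):
            continue  # already has a series ordinal
        # "somevalue"/"novalue" snaks have no datavalue, hence the defensive lookups.
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        qid = value.get("id") if isinstance(value, dict) else None
        if qid and is_confirmed(qid):
            missing.append(qid)
    return missing
```

The function only reports candidates; assigning the actual letter would still need a decision per system.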