Small issue

Hello, it looks like your bot keeps reverting itself in an infinite loop on this item: Kepler-91b (Q15460475). There might be other items affected. Romuald 2 (talk) 13:59, 5 September 2023 (UTC)

Hi again! Lately I've been editing items about exoplanets and I've noticed there are quite a lot of duplicate items, ranging from ones with exactly the same title to ones using a different designation (sometimes it's just the database being weird). For example, BD-21 397 b (Q120292596), BD-21 397 b (Q121716495), BD-21 397 b (Q120727941) and BD-21 397 b (Q119951732) were all about the same exoplanet! It would be useful for the bot to check for existing duplicates. Thanks, Romuald 2 (talk) 20:18, 17 September 2023 (UTC)
It checks, but it relies on mw:Extension:CirrusSearch, which doesn't seem to be reliable.
I'm planning to rewrite this piece of code this week and get rid of the duplicates.
Glad that someone is interested in what I'm doing :)) Ghuron (talk) 05:57, 18 September 2023 (UTC)
Thank you for checking this out. Yeah, the extension probably needs some improvement in this case. The exoplanets ARE interesting indeed :) On the French Wikipedia, there have been some debates recently about how we could improve the usage and updating of the lists of exoplanets, and we've been considering using Wikidata to help us. Romuald 2 (talk) 13:15, 18 September 2023 (UTC)
Sorry, I didn't really go into the details of what you said. What you're talking about here is a slightly more complex scenario. There are several different scripts running under my name that load data from different sources. What you seem to be asking for is the ability to automatically decide that Extrasolar Planets Encyclopaedia exoplanet ID (P5653): BD-21 397 b is the same as NASA Exoplanet Archive ID (P5667): BD-21 397 b. I'm not sure this would be a good idea. Sometimes exoplanets are "discovered", "retired" and "rediscovered" again under the same name, and a consistent naming scheme is not followed even within the same source.
I would prefer that these loaders work independently of each other and create all necessary items, and that someone with a human brain merges them here. I understand that this adds some burden on other users, and I promise to do this quarterly myself (I'm doing it as we speak).
But there is another point: not all exoplanet items correspond to confirmed exoplanets. For some of them, one source said "confirmed" and another said "unconfirmed." Real "true" exoplanets are associated with their parent star via a child astronomical body (P398) statement pointing to the exoplanet item, with the planet's letter as a series ordinal (P1545) qualifier. I would recommend that your colleagues at fr-wiki use only these exoplanets. In ru-wiki we do this (see #P1545 and ru:Список экзопланет в созвездии Большой Медведицы as an example). Ghuron (talk) 07:17, 19 September 2023 (UTC)
I mean, if two exoplanets have more or less the same identifier and more or less the same date of discovery (let's say around the same year, because one database would say "May 2023" but the other "2023"), then we can reasonably assume they are the same object. Otherwise we would need to create new items for every single entry in every single database, which goes against the very principle of Wikidata imho.
We can always put the nuances of "unconfirmed" vs "confirmed" vs "controversial" in the items. I've noticed indeed that Simbad is much more careful about the "confirmed" status of an exoplanet (and the updates are probably slower, too) than, for example, the EPE, which is sometimes too generous. And sometimes previously "confirmed" exoplanets become controversial or retired, but that's how science goes I guess. I wonder what the best way of dealing with different statuses would be. Romuald 2 (talk) 13:35, 19 September 2023 (UTC)
The problem is with "more or less", because I am spending too much time "unmerging" wrongly merged items. But yesterday I realized there are 2 simple cases that can be handled automatically with a reasonably low error rate:
  1. If no item has Extrasolar Planets Encyclopaedia exoplanet ID (P5653) with a specific id, but an item with the same NASA Exoplanet Archive ID (P5667) does exist (or vice versa), the robot can assume that they are the same thing
  2. Sometimes a NASA Exoplanet Archive ID (P5667) gets updated (when the planet transitions from "candidate" to "confirmed" status), and there are "confirmed names" tables at NASA Exoplanet Archive; the robot should attempt to use them
This will cover the vast majority of cases. I guess I didn't realize how MANY new items were being created each run :) Ghuron (talk) 05:03, 20 September 2023 (UTC)
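The two rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory `items` list, the `find_item_for_id` function and the `confirmed_names` mapping (old candidate name to confirmed name) are hypothetical and are not taken from the actual bot code.

```python
EPE = "P5653"   # Extrasolar Planets Encyclopaedia exoplanet ID
NASA = "P5667"  # NASA Exoplanet Archive ID


def find_item_for_id(items, prop, ident, confirmed_names=None):
    """Return the QID of the existing item a new identifier belongs to, or None.

    items: list of dicts such as {"qid": "Q1", "P5653": "..."} (hypothetical layout)
    confirmed_names: optional dict mapping a candidate name to its confirmed name
    """
    other = NASA if prop == EPE else EPE

    # The identifier is already present on some item: nothing new to create.
    for item in items:
        if item.get(prop) == ident:
            return item["qid"]

    # Rule 1: no item has this id under `prop`, but exactly one item carries
    # the same string under the other catalogue's property.
    matches = [i for i in items if i.get(other) == ident and prop not in i]
    if len(matches) == 1:
        return matches[0]["qid"]

    # Rule 2: a NASA id may have been renamed when a candidate was confirmed.
    if prop == NASA and confirmed_names:
        old_names = {old for old, new in confirmed_names.items() if new == ident}
        for item in items:
            if item.get(NASA) in old_names:
                return item["qid"]

    return None  # caller would create a new item
```

Returning `None` rather than guessing keeps the "more or less" cases out of the automatic path, so they can still be merged by hand.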
Thanks for the update and for the corrections. Do you have some examples of wrongly merged items, so I can get an idea of what *not* to merge? Romuald 2 (talk) 21:42, 20 September 2023 (UTC)
This one should work: [1]
It is definitely the same system, "01" normally becomes "b" when confirmed, exoplanet.eu indicates that there is only one planet in the system, etc. I did this a lot of times, and 99% of them are correct, but this one is not:
This one has an orbital period of 142 days, and this one only 13 days, so it is ~5x closer to the host star.
It is not exactly a "merge" (there were merges like that, I'm just too lazy to find one), but data from 2 different exoplanets was loaded into one item, so I had to "unscrew" them into KOI-3936.02 (Q75146435) and KOI-3936.01 (Q113594025) Ghuron (talk) 05:08, 21 September 2023 (UTC)
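The "~5x closer" figure follows from Kepler's third law: for two planets around the same star, the semi-major axis scales as the orbital period to the power 2/3. A quick check, assuming both objects orbit the same host and that planet masses are negligible (the function name is just for illustration):

```python
def semi_major_axis_ratio(period_outer_days, period_inner_days):
    """Ratio a_outer / a_inner for two planets of the same star (a is proportional to P**(2/3))."""
    return (period_outer_days / period_inner_days) ** (2.0 / 3.0)


ratio = semi_major_axis_ratio(142, 13)
print(f"{ratio:.1f}")  # about 4.9, i.e. roughly 5x
```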
Woah, this is very confusing! Yes, that was a very good example. Romuald 2 (talk) 12:55, 21 September 2023 (UTC)
So here is a sample: some NASA Exoplanet Archive ID (P5667) planets were "connected" to existing items, but some others were still created as new items. Let me know if something needs to be improved here. Ghuron (talk) 09:44, 21 September 2023 (UTC)
All of these modifications look correct, and I don't see anything obvious that needs to be improved here. I've noticed that Simbad has an entry for GJ 1151 c (Q122746175) (though it is described as a candidate), so I've added it, but strangely enough it is not present on the EPE yet. Also, TOI-332 b (Q121913979) might be present on Simbad as TOI-332.01, but not under the name "TOI-332 b" (yet), and I cannot link the two through the discovery papers. This is a case of "maybe it is the same object, maybe not, so let's not link these for now"... Romuald 2 (talk) 12:55, 21 September 2023 (UTC)

P1545

Greetings, colleague!

1. Sometimes I run into cases where child astronomical body (P398) lacks the series ordinal (P1545) qualifier, which means the corresponding planetary system is not displayed in the "Planetary systems" section of the "Stars of constellation xxx" template. Could a bot check for this and fix it?
2. When you add new planetary systems to Wikidata, could you send me the list? (Or a log; if you do this, for the first point as well.) M. Dick (talk) 00:07, 11 September 2023 (UTC)
  1. The catch is that exoplanets are not always confirmed (or earlier work has been called into question). In those cases I don't put the letter into the series ordinal, and they are not shown in the template and lists. Doing this by bot is difficult, because there is still a bit of a mess in instance of (P31)
  2. I'll write the queries, just a bit later; things are a disaster at work right now.
Ghuron (talk) 04:49, 11 September 2023 (UTC)
Understood! 1. I meant, of course, the cases where the status is "confirmed". That is, P398 is present, the status on exoplanet.eu is confirmed, and P1545 is missing. M. Dick (talk) 08:55, 11 September 2023 (UTC)
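The check requested here (P398 present, planet confirmed, P1545 missing) could look something like the sketch below. The data layout and the `missing_series_ordinal` name are assumptions made for illustration; a real bot would read the statements from Wikidata and the confirmed status from exoplanet.eu.

```python
def missing_series_ordinal(stars, confirmed_planets):
    """Yield (star_qid, planet_qid) pairs that lack a P1545 qualifier.

    stars: dict mapping a star QID to its list of P398 statements, each a dict
           like {"value": "Q2", "qualifiers": {"P1545": "b"}} (hypothetical layout)
    confirmed_planets: set of planet QIDs whose status is "confirmed"
    """
    for star_qid, statements in stars.items():
        for statement in statements:
            planet_qid = statement["value"]
            has_ordinal = "P1545" in statement.get("qualifiers", {})
            # Only confirmed planets are reported; unconfirmed ones are
            # deliberately left without a letter.
            if not has_ordinal and planet_qid in confirmed_planets:
                yield star_qid, planet_qid
```

The output is a list of candidates for review rather than an automatic fix, since the letter to assign still has to come from a source.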

Values for "type of variable star" data

Hello,

It looks like your script changes values for type of variable star (P881) in favour of less precise values, even when I've manually marked the less precise value as deprecated for that very reason. See for example here on X Hydrae (Q59054471). Romuald 2 (talk) 20:14, 7 December 2023 (UTC)

@Romuald 2: thanks for spotting this! As I can see in en:Long-period variable star#Types of variation, although General Catalogue of Variable Stars (Q222662) does not recognize LPV as an "official" class, the literature (and American Association of Variable Star Observers (Q1205564)) does use the notion of LPV and considers Mira variables a part of it. Maybe we should add a Mira variable (Q744691) subclass of (P279) long-period variable star (Q1153690) statement.
As of today my script does not recognize subclass of (P279) relations, but even if it did, I usually consider:
as a heuristic for keeping a single value with normal rank.
So either I should not try to keep only one "normal" value of type of variable star (P881), or you can favour the Mira variable (Q744691) statement with preferred rank. Ghuron (talk) 07:11, 8 December 2023 (UTC)
Yes, we can add the Mira variable (Q744691) subclass of (P279) long-period variable star (Q1153690) statement, as well as the same statement on semiregular variable star (Q1054411) and slow irregular variable (Q779609), because these are the two other widely recognized LPV types.
I see that the description for X Hydrae on Simbad is "Mira variable". Maybe there could be a way for your script to recognise that and set the variable type accordingly? In any case, I can start to use preferred rank for Mira variable instead of deprecated rank for LPV; I've wondered myself how to proceed since I started editing variable star items. Romuald 2 (talk) 15:15, 8 December 2023 (UTC)

GQ Lupi

Just curious why you made this edit. Multiple star applies to three or more stars in a system, but GQ Lupi has only two. I don't like calling it a binary system, since it's such a widely separated one, but nothing else really applies. Cheers! Huntster (t @ c) 21:08, 29 December 2023 (UTC)

@Huntster my understanding is that, if there are only two gravitationally bound components, we should use binary star (Q50053). I think I saw an item for wide binary stars, but I can't find it now. However, VizieR describes 4 components, but speaks more or less affirmatively of three. Ghuron (talk) 07:50, 30 December 2023 (UTC)
SIMBAD has notes about it on all related pages (e.g. https://simbad.u-strasbg.fr/simbad/sim-id?Ident=USNO-B1.0%200543-00373323); it's specific that what was previously considered the B component (USNO-B1.0 0543-00373323 (Q83532584)) is a false companion, and I suspect one of those VizieR records is referring to GQ Lupi b (Q1344094), where there was previous uncertainty as to whether it was a star or an exoplanet. Huntster (t @ c) 08:04, 30 December 2023 (UTC)
I guess you are right:
So GQ Lupi (Q124031718) instance of (P31) multiple star (Q878367) is not wrong, I guess. Ghuron (talk) 13:36, 31 December 2023 (UTC)
I think it's a little hasty to declare "b" a brown dwarf with any certainty. While SIMBAD simply calls it a low-mass star, most other sources consider it an extremely large exoplanet. I think the most responsible thing until its status is better known is to list both as "possibly". Huntster (t @ c) 14:06, 31 December 2023 (UTC)
We can directly see a bunch of emission lines, but they can be attributed to accretion.
So yes, it can be an oversized exoplanet or a low-mass star.
I'd say multiple star (Q878367) is a reasonable choice for a system with either 2 or 3 stars. Ghuron (talk) 19:00, 31 December 2023 (UTC)

Batch import from SIMBAD

Greetings, colleague! You did a partial batch import from SIMBAD; could you do another one for the objects that are missing from Wikidata but present in SIMBAD (at least in the constellation Hydra, for example: UW AE CQ CR CX DS EY EZ FQ FT FY GH GN IU)? M. Dick (talk) 03:34, 12 January 2024 (UTC)

Yes, I think I've finally revived it to about 99%, but I was sure that everything was already fine with Hydra.
I'll check over the weekend. Ghuron (talk) 05:54, 12 January 2024 (UTC)
Thank you very much, colleague! In Hercules, too, there are definitely more than a hundred variables missing from Wikidata, and I think the picture is similar for the rest. Could you do those when you have time?
By the way, I see that you skip some catalogue codes (and the ones I had added were removed during the update), in particular BD, CD, CPD, WISE, WISEA and some others. Was that intended? (Why?) M. Dick (talk) 19:41, 14 January 2024 (UTC)
If it's not too much trouble, send me some examples. It will be easier to fix :) Ghuron (talk) 07:52, 19 January 2024 (UTC)
Q66531921: CD, CPD and others were removed [2]. More? It's roughly the same everywhere. M. Dick (talk) 09:25, 19 January 2024 (UTC)
By the way, OO Hydrae is still missing. M. Dick (talk) 19:05, 7 February 2024 (UTC)
Here is another small problem (I'll write it here so as not to start a new topic): LW Hydrae (Q90857529) is not displayed in the Stars of the constellation Hydra template (apparently because its instance of statement is set to planetary nebula). What is the best way to handle this? Should Q13632 be removed, or will you fix it in the template code? M. Dick (talk) 14:09, 24 January 2024 (UTC)
I think the best option would be to let that item focus on the planetary nebula and create a new item for the binary (trinary?) system, linking the two. Huntster (t @ c) 15:18, 24 January 2024 (UTC)
Maybe. Let's leave it to colleague Ghuron to decide. M. Dick (talk) 16:50, 24 January 2024 (UTC)

Import from Extrasolar Planets Encyclopaedia

Hi Ghuron. Your current batch import from EPE is removing uncertainty values and then deprecating the value on a lot of items. For example, YSES 1c (Q108907355). The import appears to assume strictly that the value is as reported in EPE (which truncates uncertainties), but often the data comes from the actual journal article. So, a lot of damage is being done. I apologize if I don't respond further tonight, but I've got to sleep for work tomorrow. Huntster (t @ c) 07:17, 25 March 2024 (UTC)

I can see that recently, for some reason, EPE decided to remove ± for the masses. I can see a couple of mistakes in edits like this:
  1. It ignores the fact that 6±1 is taken from Q108907313 according to not only Extrasolar Planets Encyclopaedia (Q1385430) but also Exoplanet Archive (Q5420639)
  2. It uses the "old" (marked for deletion) 6±1 statement and thus marks the new one (6) as deprecated with reason for deprecated rank (P2241): item/value with less precision and/or accuracy (Q42727519)
I have yet to see any cases where data was taken from the actual journal article without stated in source according to (P12132) being specified; if you find one, let me know.
Give me a couple of days. I'm going to fix the problems above and re-run both imports, so the damage will be reverted. Ghuron (talk) 09:15, 25 March 2024 (UTC)
Will do, thanks for the response! Huntster (t @ c) 15:20, 25 March 2024 (UTC)
You're continuing to remove uncertainty figures, such as in this edit (I've reverted), and you haven't reversed the changes made on the 25th. At this point, I would suggest no longer using EPE as a primary source. Huntster (t @ c) 14:01, 1 April 2024 (UTC)
Help me understand what is wrong with this edit. The EPE planet b page or planet c page used to state the host star age as 0.0167±0.0014 Gyr, but right now it is 0.0167 Gyr.
I'm merely reflecting what's on the pages; I have no ability to verify what the actual article text states.
And I was not planning to revert changes from the previous run. Instead:
  1. I've made the necessary changes so that #1 no longer happens (and a subsequent import from Exoplanet Archive (Q5420639) restores data from this source)
  2. I've partially fixed #2: the rank is reverted to normal; reason for deprecated rank (P2241) is still there, but will be removed once I rewrite the corresponding piece of code
And I'm not sure what you mean by "primary source". We can either stop using EPE completely (which would be a shame, because it is updated much more frequently than Exoplanet Archive (Q5420639)), or we can accept that no data source is 100% perfect and continue using it with caution. Ghuron (talk) 14:58, 1 April 2024 (UTC)
Then what I would suggest is that, where uncertainty figures exist, do not remove them. While EPE no longer includes them and you cannot verify the article text, which is understandable, that doesn't mean the uncertainty is suddenly wrong and should be removed. Instead, if the whole numbers match, just move on and leave the uncertainty in place. If new numbers come up on EPE, then only cite EPE rather than EPE and the article. This should remove the impression that EPE's whole numbers are all that's available.
I do want to make it clear that I'm not trying to demand anything, because you're doing a great job documenting data on stellar objects, but I do want to try and preserve what data we have to the best precision possible. Huntster (t @ c) 17:48, 1 April 2024 (UTC)
I don't have any idea why Extrasolar Planets Encyclopaedia (Q1385430) cuts out ± ranges, and I fully understand your wish to preserve the exact values that are already imported in Wikidata. However, I think it is very important to be able to constantly synchronize Wikidata with external data sources. The synchronization algorithm is already quite complex. As you can see above, it requires periodic maintenance due to old hidden bugs. But now at least the task itself is concise and compact: this is the source URL, and it is required to copy as much information as possible from that source into Wikidata.
The change that you are proposing assumes a situation in which, despite the fact that the source says age estimated by a dating method (P7584): 0.0167 Gyr, the import retains the value age estimated by a dating method (P7584): 0.0167±0.0014 Gyr, because that’s what the source stated in the past. It's not impossible, but it does involve significantly more complexity in the code and makes the algorithm much more opaque to everyone (including myself :).
It seems to me that a much more promising approach would be to get data from as many sources as possible. In particular, Exoplanet Archive (Q5420639) is a much more reliable source regarding the orbits of exoplanets. The vast majority of confirmed exoplanets have both Extrasolar Planets Encyclopaedia exoplanet ID (P5653) and NASA Exoplanet Archive ID (P5667) identifiers. There is also quite detailed information about host stars, but I haven't used it yet. I think that if I manage to use it, 99% of the problems associated with the voluntaristic decisions of Extrasolar Planets Encyclopaedia (Q1385430) maintainers will no longer bother us. Ghuron (talk) 19:03, 1 April 2024 (UTC)
I don't quite understand what you mean in the second paragraph. I'm making a guess, and forgive me if I'm wrong, but you're saying that because the citation says it comes from EPE, it shouldn't contain the uncertainty value? If that is the case, then there's an immediate problem: I cannot speak to all items, of course, but many of the ones I've edited originally cited just the science article (or article/EPE as separate citations, from when EPE had an uncertainty), but EPE was later added as a secondary citation and the two citations were later merged into a single citation. So the meaning of the citation has changed over time. Huntster (t @ c) 03:10, 2 April 2024 (UTC)
Indeed, the script is now written in such a way that if it finds a statement with a value that matches what is in the external source, it “adopts” this statement, assuming that it will update that statement going forward.
If I understand you correctly, you want to be able to edit any statement manually, thereby "removing" it forever from the scope of my bots. This is usually done by specifying preferred rank, but if that does not suit you, we can discuss other options. Ghuron (talk) 08:29, 2 April 2024 (UTC)
Oh no, I am not proposing to remove it from the scope of your bot forever. I'm suggesting that if an item has a property with value 132+-075, and EPE has a value of 132, the script sees the match in the whole number (the first part) and skips it. If, in the future, EPE is updated with a new value, then it deprecates the old figure and adds in the new one. That way, data is not lost. I certainly agree with utilizing the preferred rank system, and do use it, but I have nowhere near the time to visit every astronomical object and selectively set values to preferred! Huntster (t @ c) 21:35, 3 April 2024 (UTC)
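A rough sketch of the rule proposed here, to make the decision explicit. The function name, the string-based inputs and the returned action labels are assumptions for illustration only, not the bot's actual interface; `Decimal` is used so that values such as "0.0167" compare exactly.

```python
from decimal import Decimal


def _dec(text):
    """Parse a numeric string into Decimal, passing None through."""
    return None if text is None else Decimal(text)


def plan_update(existing_value, existing_uncertainty, source_value, source_uncertainty=None):
    """Decide what an import should do with one statement.

    Returns "skip" when the source repeats the stored base value and either
    gives no uncertainty or gives the same one; otherwise returns
    "deprecate_old_add_new".
    """
    same_base = _dec(existing_value) == _dec(source_value)
    if same_base and source_uncertainty is None:
        return "skip"  # keep the stored ± range, the source merely dropped it
    if same_base and _dec(source_uncertainty) == _dec(existing_uncertainty):
        return "skip"  # nothing changed at all
    return "deprecate_old_add_new"
```

With the host star age discussed above, `plan_update("0.0167", "0.0014", "0.0167")` returns `"skip"`, so the ±0.0014 range already in Wikidata is left in place.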
The idea of not adding a new value (without ±) if both the "base" value and the source are the same might be interesting.
It complicates the "interpretability" of the import, but it is not that bizarre after all, and it might not be as much overhead as I initially thought.
Let me think it through. For the time being I'm pausing the EPE import. Ghuron (talk) 08:12, 5 April 2024 (UTC)
@Huntster, could you please review my last edit in YSES 1c (Q108907355) to see whether I got your idea right? Ghuron (talk) 06:11, 21 April 2024 (UTC)
Yes, that seems to be effective. Nice. I'm curious whether this removal was intentional or somehow a side effect? Huntster (t @ c) 05:19, 27 April 2024 (UTC)
R-band is no longer on SIMBAD. Ghuron (talk) 06:53, 27 April 2024 (UTC)