ADS Bibcode errors

Greetings. I'm looking at the list of "Single value" violations for ADS bibcode (P819) (see [1]). I think many of them, perhaps most, are a result of edits you made on 23 Aug 2019. I know that's a long time ago. If I'm not mistaken about this, is it possible to revert them en masse? I started individually deleting the erroneous bibcodes, but there are too many to continue. See https://www.wikidata.org/w/index.php?title=Q56881547&oldid=1000855830 Many thanks. Trilotat (talk) 15:19, 28 July 2020 (UTC)

@Trilotat: I did this intentionally because those articles are referred to by two different ADS bibcodes in SIMBAD. For instance, Eleven stars having variable radial velocities (Q56881547) has both 1909ApJ....29..224. and 1909ApJ....29..224C Ghuron (talk) 15:27, 28 July 2020 (UTC)
Drat! I'll undo my deletes. Sorry. Trilotat (talk) 15:48, 28 July 2020 (UTC)
@Trilotat: The majority of them are very old articles. Once I finish the initial SIMBAD synchronization, I'll leave only the "major" bibcode Ghuron (talk) 15:50, 28 July 2020 (UTC)
I don't at all understand what SIMBAD is, but I've seen it referenced. I am trying to add the bibcode to earth science (mostly geology) article items, but the way I go about it is very tedious. Should I look into SIMBAD as a tool, or is my ignorance so thorough that I'm contemplating an inappropriate use for SIMBAD? Trilotat (talk) 15:54, 28 July 2020 (UTC)
@Trilotat: My understanding is that the primary source of ADS bibcodes is https://ui.adsabs.harvard.edu/. SIMBAD contains only a small subset of astronomical articles. I am uploading information from SIMBAD, e.g. Theta2 Tauri (Q6145898) radial velocity (P2216) 47.0, and SIMBAD also provides a reference [2] for this fact as 1909ApJ....29..224. Based on that, I can add stated in (P248) Eleven stars having variable radial velocities (Q56881547) to the statement above. I don't think SIMBAD contains a lot of information regarding your area of interest Ghuron (talk) 16:05, 28 July 2020 (UTC)
I truly understood nothing of that info you linked to. haha. Thanks, though, for saving me the time spent on researching SIMBAD. Have a great day and be well. Trilotat (talk) 16:12, 28 July 2020 (UTC)
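For readers following the bibcode discussion above: an ADS bibcode is a fixed-width 19-character string (year, journal, volume, qualifier, page, first-author initial), so the two codes quoted for Q56881547 differ only in the last character. The snippet below is a minimal sketch of how such pairs could be detected; the helper names `parse_bibcode` and `differ_only_in_author_initial` are hypothetical and are not taken from any script mentioned in this thread.

```python
from typing import NamedTuple


class Bibcode(NamedTuple):
    """Fields of a 19-character ADS bibcode (YYYYJJJJJVVVVMPPPPA)."""
    year: str
    journal: str
    volume: str
    qualifier: str
    page: str
    author: str


def parse_bibcode(code: str) -> Bibcode:
    """Split a bibcode into its fixed-width fields (hypothetical helper)."""
    if len(code) != 19:
        raise ValueError(f"expected 19 characters, got {len(code)}: {code!r}")
    return Bibcode(code[0:4], code[4:9], code[9:13], code[13], code[14:18], code[18])


def differ_only_in_author_initial(a: str, b: str) -> bool:
    """True if two distinct bibcodes share everything except the final character."""
    first, second = parse_bibcode(a), parse_bibcode(b)
    return first != second and first[:5] == second[:5]


# The two codes quoted in the thread for the same 1909 ApJ article.
print(differ_only_in_author_initial("1909ApJ....29..224.", "1909ApJ....29..224C"))
```

A check like this only flags candidates; whether one of the two codes should be kept as the "major" one is still an editorial decision.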

Throttling and maxlag

Hey Ghuron, you seem to be using automation with your regular account to merge items. Unfortunately, your throttling is not compatible with the Wikidata:Bots policy, as you are doing 80-90 edits per minute while the servers are under pressure (maxlag>5). Can you please implement a proper throttling mechanism that pauses editing while maxlag>5 in order to edit in compliance with the bot policy? Thank you, MisterSynergy (talk) 12:27, 10 August 2020 (UTC)

@MisterSynergy: old script, forgot to add maxlag there. Fixed now Ghuron (talk) 17:08, 10 August 2020 (UTC)
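For reference, the throttling requested above can be sketched roughly as follows. This is not the script discussed in this thread; it is a minimal illustration that assumes the MediaWiki API reports server lag as a JSON error with code `maxlag` and a `lag` value in seconds. The function name `edit_with_maxlag` and the injected `send` callable (whatever actually posts the request) are hypothetical.

```python
import time
from typing import Any, Callable, Dict


def edit_with_maxlag(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    params: Dict[str, Any],
    maxlag: int = 5,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Send one API request with maxlag set, pausing while the servers are lagged."""
    request = dict(params, maxlag=maxlag, format="json")
    for _ in range(max_retries):
        response = send(request)
        error = response.get("error", {})
        if error.get("code") != "maxlag":
            return response  # success, or an unrelated error for the caller to handle
        # Assumed response shape: the error carries the current lag in seconds.
        # Wait at least 5 seconds before retrying.
        sleep(max(5.0, float(error.get("lag", 5))))
    raise RuntimeError("servers still lagged after retries; giving up")
```

Passing `send` and `sleep` as parameters keeps the retry logic separate from the HTTP client, so the same loop works with any library and can be exercised without touching the live servers.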

Gaia DR2 3695915157453728000

Hello, I have noticed this item and 4 million similar items about stars. Do they have any use? I mean, you could import 1.7 billion stars, but I don't see that this is useful. If you need a way to link the Gaia DR2 identifier, I'd suggest instead creating a property like "Gaia DR2 ID" that you can add to every star that has an article somewhere, if you need to link a source. There is Gaia DR1, and soon there will be Gaia EDR3, Gaia DR4 and Gaia DR5, or whatever the names will be, with an estimated 10 billion objects combined. This is obviously not a good idea. Can you explain it? --Giftzwerg 88 (talk) 08:20, 9 September 2020 (UTC)

You seem to be assuming that the only legitimate use of a Wikidata item is to have "an article somewhere". I don't think that is a correct assumption, but even within Wikipedias we might be able to utilize items that have no sitelinks but are:
In addition to that, I do believe that the astrodata in this project might be useful to external users as well. I don't know where Stellarium (Q119931) gets its data, but I hope that at some point they will take it from Wikidata and help convert their users into our contributors.
I'm not aiming to import 1.6 billion stars from Gaia DR2; it is just not possible from a technical perspective. Instead I'm focusing on importing data from SIMBAD (Q654724), which has a more bearable 11 million objects. I don't know the criteria for inclusion in SIMBAD (do you?), but so far they seem to be reasonable. More than 90% of the items have more than one external identifier. Gaia DR2 3695915157453728000 (Q78739267) right now has only one identifier, but it is relatively bright and close to Earth. I am open to discussing stricter WD:N criteria for astro-objects, but frankly speaking I do not see any need for this. After all, we already have 36M+ items with instance of (P31) scholarly article (Q13442814).
Nor do I see any need for a dedicated Gaia DR2 property. I am completely fine querying for p:P528[ps:P528 ?dr2; pq:P972 wd:Q51905050] for any practical purposes. If you see such a need, please go ahead with a property proposal. Ghuron (talk) 09:10, 9 September 2020 (UTC)
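To make the query pattern above concrete, here is a hedged sketch that wraps it in a complete SPARQL query: catalog code (P528) statements qualified with catalog (P972) pointing at the Gaia DR2 item (Q51905050), as quoted in the comment above. The function names `build_catalog_query` and `run_query` and the User-Agent string are illustrative assumptions; the prefixes `p:`, `ps:`, `pq:` and `wd:` are predefined on the Wikidata Query Service.

```python
import json
import re
import urllib.parse
import urllib.request


def build_catalog_query(catalog_qid: str, limit: int = 10) -> str:
    """Build a query for items whose catalog code (P528) is qualified with the given catalog (P972)."""
    if not re.fullmatch(r"Q[1-9]\d*", catalog_qid):
        raise ValueError(f"not a Wikidata item ID: {catalog_qid!r}")
    return (
        "SELECT ?item ?code WHERE {\n"
        f"  ?item p:P528 [ ps:P528 ?code ; pq:P972 wd:{catalog_qid} ] .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def run_query(query: str, endpoint: str = "https://query.wikidata.org/sparql") -> list:
    """Send the query to the public endpoint and return the result bindings (needs network access)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "talk-page-example/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


# Build (but do not send) the query for Gaia DR2 designations.
print(build_catalog_query("Q51905050"))
```

The same builder works for any other catalog item, which is the practical argument for the qualifier approach over one property per catalog.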
I am quite aware that Wikidata has uses other than linking Wikipedia articles; that is why I asked. The point is that these stars probably have many different IDs, not only a DR2 number but also a DR1 number, an EDR3 number, SIMBAD, USNO, 2MASS, Hipparcos, Tycho... So in short the star has many names, but it would be better to give the star one name and store the IDs from different catalogues as properties. I do not have the ultimate solution, but it seems arbitrary to name a star after its Gaia DR2 designation. I expect the ID from the final Gaia catalogue will be some kind of agreed and widely used identifier once it is published, and there will also be a software solution that connects every other ID from previous catalogues to this final Gaia ID. Someday it will be easier to just link or query the Gaia source directly instead of importing it to Wikidata (except for those objects that have articles, of course). The problem with importing is always that anybody can change the data in items at any given time, willfully or by accident. Objects can change names, get merged or be deleted. So Wikidata might be useful, but it is not a reliable source to use for scientific projects etc.--Giftzwerg 88 (talk) 18:46, 12 September 2020 (UTC)
AFAIK SIMBAD supports ~20000 different catalogs; when importing them to catalog code (P528) I recognize the ~1000 most widely used. I don't think 20000 or even 1000 new properties make any sense right now. SIMBAD also has a naming scheme (see this - they decided that LSPM is more recognizable than UCAC4) and I generally follow it. It is true that I do not spend a lot of time thinking about which label to assign. I'm more interested in importing data than in assigning labels, and I know a lot of people on Wikidata who specialize in assigning consistent labels/descriptions. Not my cup of tea.
Yes, the ability to run federated queries across multiple catalogs might be helpful, but we are nowhere near that yet. Waiting for a federated SPARQL (Q54871) endpoint even for the DR2 source would probably take forever. Everybody in astronomy is bound to rusty ADQL, which has no federation capabilities. CDS allows you to query multiple "tables", but they have to host their own copy of the data (and they are already lagging in synchronization). But in the 2020s we are going to have hundreds of really big datasets beyond their capabilities, so this approach will certainly fail.
So yes, Wikidata (as well as Wikipedia) will probably never be a reliable source to use for scientific projects. Wikidata (as well as Wikipedia) hosts its own copy of data extracted from reliable sources. Wikidata (as well as Wikipedia) data can be vandalised at any given time, become outdated, etc. But I don't think that should stop us from populating Wikidata (or writing Wikipedia articles). Unlike in Wikipedia, here it is easier to stand on the shoulders of giants. I can synchronize changes that are made by SIMBAD staff (renaming/merging/deleting) with only a few hundred lines of Python code (not there yet, but approaching it). And there are data that might be interesting for us (e.g. notable enough for a Wikipedia article) and not that interesting for SIMBAD (missing or inconsistent there): exoplanets, binary/multiple star systems, constellations, the hierarchical structure of the Milky Way (Q321), etc. Ghuron (talk) 04:47, 13 September 2020 (UTC)
As a side note, I suggest you add all catalog codes as aliases. This makes items easy to find and reduces the rate of duplicates (for the second point, you should still check for them before creation).--GZWDer (talk) 20:01, 17 September 2020 (UTC)
I added aliases in the past, but someone (I don't remember who) reverted it because there were too many. So I decided to leave this to people who specialize in labels/aliases/descriptions.
And I do check for duplicates but:
  • SIMBAD sometimes (quite often) merges items that were initially separate
  • SIMBAD itself is not perfect and sometimes contains duplicates
  • WDQS replication lag sometimes allows duplicates to slip through
so in order to maintain integrity I am merging tens of thousands of items. Let me know if you see where I can improve. Ghuron (talk) 02:58, 18 September 2020 (UTC)
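As an illustration of the duplicate check discussed above (not the actual import script, which is not shown in this thread), one simple approach is to group items by catalog code and flag every code that is attached to more than one item. The function name `find_duplicate_candidates` and the item IDs in the usage example are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_duplicate_candidates(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each catalog code that appears on more than one item to the sorted list of those items."""
    items_by_code = defaultdict(set)
    for item_id, catalog_code in pairs:
        items_by_code[catalog_code].add(item_id)
    return {
        code: sorted(items)
        for code, items in items_by_code.items()
        if len(items) > 1
    }


# Hypothetical (item, catalog code) pairs, e.g. taken from a query result.
sample = [
    ("Q1000001", "HD 12345"),
    ("Q1000002", "HD 12345"),
    ("Q1000003", "HD 67890"),
]
print(find_duplicate_candidates(sample))
```

A shared code only makes two items merge candidates: as noted in the list above, the source catalog can itself contain duplicates or later merge objects, so flagged pairs still need checking before a merge.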

We sent you an e-mail

Hello Ghuron,

Really sorry for the inconvenience. This is a gentle note to request that you check your email. We sent you a message titled "The Community Insights survey is coming!". If you have questions, email surveys@wikimedia.org.

You can see my explanation here.

MediaWiki message delivery (talk) 18:45, 25 September 2020 (UTC)

Incorrect description "Israeli musician"

Hi Ghuron,

Thank you for removing the incorrect statements noted here. However, the erroneous statements caused bots to add incorrect descriptions to these items. It would be truly appreciated if you could remove these wrong descriptions.

Thanks!

Keren - WMIL (talk) 10:52, 21 January 2021 (UTC)

@Keren - WMIL: I would certainly appreciate a few examples Ghuron (talk) 11:11, 21 January 2021 (UTC)
Thanks for the quick response. See Mordekhai Ron (Q6630190), Q6263995, Q27990274 for example. Keren - WMIL (talk) 11:14, 21 January 2021 (UTC)