Wikidata:Events/IRC office hour 2020-07-21

Participants

(Telegram nicknames)

  • amire80
  • Arthur Smith
  • Auregann
  • Csisc1994
  • David Causse
  • Guillaume Lederrey
  • IsmaelOlea
  • Jan_ainali
  • Joseph Allemandou
  • lucaswerkmeister
  • mahir256
  • markus_goellnitz
  • moebeus
  • moedn
  • Nightrose
  • Nikki
  • Sam Alipio
  • Sannita
  • vrandecic
  • zbyszko_p

Content

2020-07-21T16:01:16 <Léa ~ Auregann> Hello all, and welcome to the Wikidata & Wikibase office hour! 🎉
2020-07-21T16:01:30 <Sam Alipio> \o/ back again!
2020-07-21T16:01:44 <Léa ~ Auregann> This meeting will last 90 minutes. During the first 30min, we will present the news around the development of Wikidata & Wikibase. Then, we will welcome some guests! Guillaume, David and Zbyszko from WMF’s search team will talk about the Query Service. Finally, we will keep 40min for questions and discussions.
2020-07-21T16:01:53 <Lydia Pintscher> Hey everyone :)
2020-07-21T16:02:10 <Guillaume Lederrey> o/
2020-07-21T16:02:10 <Lucas Werkmeister> hello \o/
2020-07-21T16:02:19 <Jan Ainali> 👋
2020-07-21T16:02:27 <মাহির মোরশেদ> Hello!
2020-07-21T16:02:41 <Denny Vrandečić> Hi there!
2020-07-21T16:02:55 <Léa ~ Auregann> But first of all, let me start with a quick introduction. As no onsite conferences took place this year, we didn’t have our opportunity for an annual reminder that we are real humans behind usernames :) That’s why I thought we could start with presenting the hosts of tonight’s office hour!
2020-07-21T16:03:06 <Léa ~ Auregann> <document>
2020-07-21T16:03:24 <Léa ~ Auregann> @Nightrose Lydia and @Sam_Alipio Sam, product managers respectively for Wikidata and Wikibase, are doing an incredible job on bringing the software behind Wikidata to its best. They deal with two very packed roadmaps, several teams of developers, UX researchers and many other people involved in the development of Wikibase. They make sure that the features we develop make the diverse stakeholders around Wikidata & Wikibase happy, in order to ensure the stability of Wikidata in the future.
2020-07-21T16:03:45 <Léa ~ Auregann> @masssly Mohammed joined the Community Communications team 3 months ago, welcome \o/ Together, we take care of communication between the software development teams and the communities. This includes exchange of information in both directions: sharing what the dev team is working on, announcing the changes to come, and also collecting feedback, issues and wishes coming from you. You can get in touch with us on https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team
2020-07-21T16:04:45 <Léa ~ Auregann> As the Wikidata community keeps growing <3 and we strengthen the Wikibase community, we have more and more channels to follow and requests to address. We spend a lot of time participating in the internal processes of the development team, advocating for the community, making sure that your voices are heard, and you can be sure that we will always take your side and represent you as best as we can during discussions and decision-making.
2020-07-21T16:05:07 <Léa ~ Auregann> We’re all humans - unless you can prove the opposite ;) We are entirely dedicated to Wikidata & Wikibase, and we do nothing but our best. Sometimes, we make mistakes, we make decisions that you don’t like, or just wrong decisions. We are always open to discuss these with you, as long as the interactions stay calm and constructive, and that both sides assume good faith. And I hope that we can work all together to keep this nice atmosphere in digital spaces where we interact with each other :)
2020-07-21T16:05:40 <Léa ~ Auregann> This being said, it’s time to start with the development news!
2020-07-21T16:05:49 <Lydia Pintscher> :D
2020-07-21T16:06:05 <Lydia Pintscher> Quite a few things happened since the last office hour on the development side.
2020-07-21T16:06:43 <Lydia Pintscher> One of the big things is the Wikidata Bridge. The code for the first version is now written and the Catalan Wikipedians are lovely people and agreed to be the first to try out the Bridge.
2020-07-21T16:07:11 <Jan Ainali> \o/
2020-07-21T16:07:13 <Lydia Pintscher> We're currently still dealing with some remaining feedback from the security review and then we're ready to enable it on Catalan Wikipedia.
2020-07-21T16:07:45 <Lydia Pintscher> The first version will not be super powerful but hopefully a good start to build on.
2020-07-21T16:07:54 <Houcemeddine Turki> :)
2020-07-21T16:07:54 <Houcemeddine Turki> This is an achievement
2020-07-21T16:08:13 <Luca Martinelli [Sannita]> fantastic news!
2020-07-21T16:08:21 <Amir Aharoni> reply to <Lydia Pintscher> "One of the big thing…" Com sempre :)
💛❤️💛❤️💛❤️💛❤️💛
2020-07-21T16:08:30 <Lydia Pintscher> Another thing that we worked on was looking at the rest of the web and seeing if we can find references for statements in Wikidata that currently don't have a reference.
2020-07-21T16:08:57 <Jan Ainali> reply to <Lydia Pintscher> "The first version wi…" Is there a test instance now?
2020-07-21T16:09:04 <Lydia Pintscher> The web has a ton of semantic markup of all kinds of interesting data and we worked on making some of that usable for us.
2020-07-21T16:09:23 <Houcemeddine Turki> reply to <Lydia Pintscher> "Another thing that w…" I am already working on the same issue and received funding from WikiCred
2020-07-21T16:09:34 <Houcemeddine Turki> <document>
2020-07-21T16:09:47 <Léa ~ Auregann> reply to <Jan Ainali> "Is it any test insta…" Here's the description of the test instance with a link :) https://www.mediawiki.org/wiki/Topic:Vct71iey9c81rm6l
2020-07-21T16:10:06 <Lydia Pintscher> We found a lot of potential references 2000 and some are available in a new game in the Wikidata Game. The rest we are polishing right now and then we'll make them available as a dump incl. some analysis of the current decisions in the game.
2020-07-21T16:10:38 <Lydia Pintscher> Another area we worked on is the query builder. SPARQL is awesome and super powerful but also not always easy.
2020-07-21T16:11:22 <Lydia Pintscher> We looked into making that easier a while ago by building a visual query builder.
2020-07-21T16:11:51 <Lydia Pintscher> The research and mockups for the first version of that are almost finished and we've tested it with a few of you. <3 for the feedback.
2020-07-21T16:12:06 <Lydia Pintscher> Coding for that will start in early August and you'll be able to try it as we go along of course.
2020-07-21T16:12:27 <Houcemeddine Turki> reply to <Lydia Pintscher> "Coding for that will…" +1
2020-07-21T16:12:36 <Houcemeddine Turki> This will be interesting
2020-07-21T16:13:11 <Lydia Pintscher> And one final big thing that is in the very early stages still is how we can make it easier for programmers to access our data.
2020-07-21T16:13:44 <Lydia Pintscher> We've been doing research around that for a few weeks now. It will conclude in the next quarter, and then we'll see how we can, for example, improve our APIs.
2020-07-21T16:14:34 <Lydia Pintscher> And last but not least we did a lot of smaller fixes like cursor focus on Senses and Forms input.
2020-07-21T16:14:56 <Lydia Pintscher> And with that I'm handing over to Sam who'll talk a bit more about the Wikibase things that happened.
2020-07-21T16:15:03 <Jan Ainali> reply to <Lydia Pintscher> "We're doing research…" Is a paid version in your pipeline as well (like the WM 2030 Strategy is talking about)?
2020-07-21T16:15:11 <Sam Alipio> here's some news from the wonderful world of Wikibases👽
2020-07-21T16:15:14 <Nikki> I appreciate that \o/
2020-07-21T16:15:15 <Houcemeddine Turki> Here, there are two issues: timeout errors and the ease of adding structured data.
2020-07-21T16:15:34 <Houcemeddine Turki> reply to <Lydia Pintscher> "And last but not lea…" Interesting
2020-07-21T16:15:37 <Sam Alipio> we worked on the design system to have a set of unified components for Wikidata/Wikibase that will make it easier to develop new features in the future
2020-07-21T16:15:55 <Sam Alipio> The team continued work on “Federated Properties” (allowing remote Wikibases to access Wikidata’s properties & use them to make statements on local items)
2020-07-21T16:16:00 <Lydia Pintscher> reply to <Jan Ainali> "Is a paid version in…" That's not our focus of this research. Since Wikidata's APIs are part of the general Wikimedia APIs we'll probably somehow have to follow and will be involved in the discussions but details are unclear.
2020-07-21T16:16:17 <Sam Alipio> We have started gathering initial feedback from two institutions who are testing an early version of the federation feature (very exciting)
2020-07-21T16:16:33 <Sam Alipio> We expect to release the completed first version of the federation feature to the community before the end of 2020!
2020-07-21T16:16:52 <Joseph Allemandou> <new_chat_participant,new_chat_member,new_chat_members>
2020-07-21T16:16:55 <Sam Alipio> In other news, we conducted user research on item merge workflows in Wikidata (mostly using Merge.js). This will help us prepare to build a native item merging feature for every Wikibase
2020-07-21T16:17:10 <Jan Ainali> reply to <Lydia Pintscher> "That's not our focus…" I guess that is as good as an answer I could have hoped for, I'll take it :)
2020-07-21T16:17:24 <Sam Alipio> The Wikidata/Wikibase team also completed our first prototyping week of 2020 where we...
2020-07-21T16:17:26 <Lydia Pintscher> reply to <Jan Ainali> "I guess that is as g…" Hehe
2020-07-21T16:17:32 <Sam Alipio> Built a proof of concept for using a GraphQL API to provide access to Wikibase/Wikidata data
2020-07-21T16:17:40 <Markus Göllnitz> reply to <Sam Alipio> "we worked on the des…" software design or UI?
2020-07-21T16:17:43 <Sam Alipio> Prototyped a manifest format that allows toolbuilders to automatically access important configuration information about a Wikibase (to make it easier for tools to be made to work with any Wikibase)
2020-07-21T16:17:51 <Sam Alipio> investigated ranking for Items to order them by their relevance in a query result
came up with workflows for editing statements linking to other Items in the Wikidata Bridge
2020-07-21T16:17:52 <Houcemeddine Turki> reply to <Lydia Pintscher> "That's not our focus…" The Wikimedia API has been successful because it is free. This matters.
2020-07-21T16:17:59 <Sam Alipio> Assessed the Wikidata item page against a set of accessibility standards to identify key areas for improvement
2020-07-21T16:18:13 <Sam Alipio> and finally, prototyped design system components to continue improving consistency
2020-07-21T16:18:32 <Léa ~ Auregann> reply to <Markus Göllnitz> "software design or U…" UI components
2020-07-21T16:18:40 <Sam Alipio> Now back to Léa
2020-07-21T16:18:41 <Amir Aharoni> reply to <Sam Alipio> "In other news, we co…" <sticker>
2020-07-21T16:18:48 <Léa ~ Auregann> Thanks Lydia & Sam for these news!
2020-07-21T16:19:01 <Léa ~ Auregann> Now, for a few more news outside of the software development area
2020-07-21T16:19:15 <Léa ~ Auregann> Wikidata has a new admin: Stanglavine, welcome back onboard!
2020-07-21T16:19:18 <Amir Aharoni> reply to <Sam Alipio> "In other news, we co…" I am so glad to hear this.
2020-07-21T16:19:30 <Léa ~ Auregann> we also got some new CheckUsers: علاء, BRPever, Sotiale, Jasper Deng, welcome onboard!
2020-07-21T16:19:45 <Léa ~ Auregann> Tons of events happened online, despite the difficult situation the community continued being amazing and doing outreach for Wikidata: remote hackathon, WikidataLab, plenty of livestream sessions, videos, podcasts, editathons… Thank you so much <3
2020-07-21T16:19:51 <Markus Göllnitz> reply to <Sam Alipio> "Built a proof of con…" cool
2020-07-21T16:19:59 <Ismael Olea> Hi, I'd need some feedback about a property modification: https://www.wikidata.org/wiki/Wikidata:Project_chat#Modification_of_values_for_identificador_Patrimonio_Inmueble_de_Andaluc%C3%ADa_(P3318)
2020-07-21T16:20:03 <Léa ~ Auregann> Speaking of events: Wikidata’s birthday is happening in October: it’s gonna be a great distributed remote event, you can be part of it! https://www.wikidata.org/wiki/Wikidata:Eighth_Birthday Join the Telegram group if you want to get fresh updates https://t.me/joinchat/HGjGexK8LA2wJZEk1x1p_A
2020-07-21T16:20:36 <Léa ~ Auregann> Winners of the Wikidata competition during Museum Day 2020: VIGNERON, Alexmar983, Braveheart, Pasleim, Benoît Prieur, Uli.ch, Nono314, Sukkoria and Airon90. Congratulations! https://meta.wikimedia.org/wiki/Wikimedia/Museum_Day_2020/Wikidata_Competition/Winners
2020-07-21T16:20:54 <Léa ~ Auregann> There are now over 3 million uses of the Wikidata Infobox in Commons categories \o/
2020-07-21T16:21:03 <Sam Alipio> reply to <Léa ~ Auregann> "Winners of the Wikid…" 🎊
2020-07-21T16:21:04 <Léa ~ Auregann> https://commons.wikimedia.org/wiki/Category:Uses_of_Wikidata_Infobox
2020-07-21T16:21:12 <Lucas Werkmeister> reply to <Léa ~ Auregann> "There are now over 3…" 🎉
2020-07-21T16:21:22 <Lydia Pintscher> \o/
2020-07-21T16:21:23 <Léa ~ Auregann> Abstract Wikipedia was approved as a future Wikimedia project. We don’t know exactly yet how it will connect with Wikidata, but we will certainly figure it out together https://meta.wikimedia.org/wiki/Abstract_Wikipedia
2020-07-21T16:21:38 <Lucas Werkmeister> very exciting!
2020-07-21T16:21:45 <Léa ~ Auregann> Now as usual, over the past months, plenty of tools got created or improved by the community <3
2020-07-21T16:21:45 <Lydia Pintscher> \o/
2020-07-21T16:21:45 <Denny Vrandečić> We certainly will!
2020-07-21T16:21:52 <Jan Ainali> That should have been the headline :)
2020-07-21T16:21:52 <Houcemeddine Turki> reply to <Sam Alipio> "Prototyped a manifes…" Are you working on developing a data model for Wikibase for each purpose (Library Management, Health Records...)? E.g. WikibaseLib.
2020-07-21T16:21:56 <Léa ~ Auregann> Here are a few of them:
2020-07-21T16:22:08 <Léa ~ Auregann> COVID19 dashboard https://sites.google.com/view/covid19-dashboard/
2020-07-21T16:22:24 <Léa ~ Auregann> Anagram generator based on Wikidata names http://apf.geobib.fr/
2020-07-21T16:22:29 <Denny Vrandečić> Is there a link to the Query builder design and research?
2020-07-21T16:22:42 <Léa ~ Auregann> LexData – a python library to edit Lexicographical data – has been released in a major new version https://nudin.github.io/LexData/
2020-07-21T16:22:45 <Houcemeddine Turki> reply to <Léa ~ Auregann> "Winners of the Wikid…" My Congratulations.
2020-07-21T16:23:01 <Sam Alipio> reply to <Houcemeddine Turki> "Are you working on d…" I will speak a little more later in this office hour on how we plan to continue this prototype -- it is mostly limited to making it easier for toolbuilders to automatically detect the configuration of a Wikibase to make tools work outside of Wikidata
2020-07-21T16:23:09 <Léa ~ Auregann> Structured Search, a tool allowing you to search through Wikimedia Commons using structured data https://tools.wmflabs.org/hay/sdsearch/
2020-07-21T16:23:21 <Léa ~ Auregann> script by Tohaomg to easily rearrange the order of values for statements in Wikidata https://www.wikidata.org/wiki/User:Tohaomg/rearrange_values.js
2020-07-21T16:23:36 <Léa ~ Auregann> MyCroft, the open digital personal assistant now has a Wikidata skill to answer questions about current and historic facts & information about a person. https://store.kde.org/p/1389861
2020-07-21T16:23:44 <Léa ~ Auregann> Wolfram Alpha now allows you to access data from Wikidata using the Wolfram Language in a ton of really interesting ways: https://blog.wolfram.com/2020/07/09/accessing-the-world-with-the-wolfram-language-external-identifiers-and-wikidata/
2020-07-21T16:23:56 <Léa ~ Auregann> and finally, you can generate an almanac with Wikidata: http://vintagedata.org/almanac/query.php
2020-07-21T16:24:03 <Sam Alipio> reply to <Léa ~ Auregann> "Wolfram Alpha now al…" this is so cool
2020-07-21T16:24:27 <Lydia Pintscher> reply to <Denny Vrandečić> "Is there a link to t…" Unfortunately not yet updated but here is the latest stuff: https://www.wikidata.org/wiki/Wikidata:Improve_the_workflows_for_queries_and_lists/Simple_query_Builder Will be updated soon (TM)
2020-07-21T16:24:32 <Léa ~ Auregann> Of course, we also got a bunch of new WikiProjects: Biodiversity (formerly iNaturalist), Speed skating, Witches, Schools, Public art, Sweden, Ennegreciendo/Noircir Wikimedia - check them out! WikiProjects are the place to be to work on a specific topic on Wikidata
2020-07-21T16:24:38 <Denny Vrandečić> Thanks!
2020-07-21T16:24:52 <Léa ~ Auregann> And finally, here are a few interesting things to watch and read:
2020-07-21T16:24:59 <Léa ~ Auregann> Editing Wikidata and creating a property proposal https://www.youtube.com/watch?v=TAb_AvPRqj4
2020-07-21T16:25:06 <Léa ~ Auregann> Lydia presenting shape expressions https://www.youtube.com/watch?v=oXva9u3V0lc (more videos from the course here https://www.wikidata.org/wiki/Wikidata:Status_updates/2020_05_11#Press,_articles,_blog_posts,_videos )
2020-07-21T16:25:11 <Léa ~ Auregann> Defying Wikidata: Validation of terminological relations in the web of data https://aran.library.nuigalway.ie/handle/10379/15919
2020-07-21T16:25:20 <Léa ~ Auregann> A band name generator with Wikidata https://www.textjuicer.com/2020/05/bandaid-a-band-name-generator/
2020-07-21T16:26:06 <Léa ~ Auregann> If you want to know more about the cool things that other community members are doing, or add more to the list: please consider subscribing & contributing to the Wikidata newsletter! You will make @masssly very happy :) https://www.wikidata.org/wiki/Wikidata:Status_updates/Next
2020-07-21T16:26:27 <Léa ~ Auregann> And now, let's talk about what's coming next :)
2020-07-21T16:26:35 <Houcemeddine Turki> reply to <Lydia Pintscher> "Unfortunately not ye…" Still interesting
2020-07-21T16:27:10 <Lydia Pintscher> Ok so what's next? As I said earlier the Bridge will be deployed on the first Wikipedia and we'll expand it based on their feedback.
2020-07-21T16:27:28 <Lydia Pintscher> We'll code the first version of the query builder and test it with you all.
2020-07-21T16:28:02 <Lydia Pintscher> And we'll continue with the research (and then hopefully also implementation) for making it easier to access the data in Wikidata for programmers.
2020-07-21T16:28:10 <Lydia Pintscher> Sam: how about the Wikibase side?
2020-07-21T16:29:04 <Sam Alipio> once we complete the major engineering behind "federated properties", we have some fun planned
2020-07-21T16:29:17 <Sam Alipio> we will be starting development work on a production version of “automated configuration discovery for toolbuilders” (continuing the previously mentioned prototype we built in April)
2020-07-21T16:29:28 <Sam Alipio> The prototype can be explored here https://github.com/wmde/WikibaseManifest
2020-07-21T16:29:52 <Sam Alipio> the goal is to make it easier for toolbuilders to get their tools to work with custom Wikibase instances, increasing the reach and impact of valuable Wikidata tools!
2020-07-21T16:30:13 <Sam Alipio> and now on to our next topic...
2020-07-21T16:30:22 <Moe> reply to <Sam Alipio> "the goal is to make …" ❤️❤️
2020-07-21T16:30:36 <Léa ~ Auregann> Yes! We have some guests tonight 🎉
2020-07-21T16:31:20 <Léa ~ Auregann> please welcome Guillaume, David and Zbyszko from WMF’s search team who will talk about the present and the future of the Query Service!
2020-07-21T16:31:32 <Sam Alipio> welcome esteemed guests 😄
2020-07-21T16:31:40 <Houcemeddine Turki> Hello
2020-07-21T16:31:50 <মাহির মোরশেদ> \o/ 🎉
2020-07-21T16:31:52 <Guillaume Lederrey> Hello all! I'm Guillaume, engineering manager for the Search Platform team
2020-07-21T16:32:35 <Guillaume Lederrey> The team name is somewhat misleading: while we do take care of search, these days we are spending most of our time on Wikidata Query Service
2020-07-21T16:32:51 <Guillaume Lederrey> Joining me are Zbyszko and David:
2020-07-21T16:33:40 <David Causse> Hey all, I'm David (dcausse) working on wdqs and also CirrusSearch
2020-07-21T16:34:04 <Zbyszko Papierski> Hey I'm Zbyszko, from Poland, I work mostly on WDQS
2020-07-21T16:34:29 <Guillaume Lederrey> Thanks for inviting us!
2020-07-21T16:34:55 <Guillaume Lederrey> We also have Search Platform office hours monthly, advertised on the discovery mailing list
2020-07-21T16:35:48 <Guillaume Lederrey> What we've been working on lately:
2020-07-21T16:36:03 <Zbyszko Papierski> We will be launching a beta endpoint for the Wikimedia Commons Query Service: a SPARQL endpoint, based on WDQS, that will allow querying Structured Data on Commons and using federation with WDQS. The service will require a Commons account and may be prone to unexpected outages. The service (and endpoint) should be announced soon. Keep in mind that, while we do want to move this service to production, it is currently unknown when that will happen. Additionally, the same limitations as WDQS’s will apply.
2020-07-21T16:36:05 <Houcemeddine Turki> reply to <Guillaume Lederrey> "We also have Search …" Interesting idea
2020-07-21T16:36:46 <Guillaume Lederrey> One of our ongoing major projects is making WDQS more robust and scalable.
2020-07-21T16:36:57 <Jan Ainali> reply to <Guillaume Lederrey> "We also have Search …" Is it also announced/listed on-wiki somewhere?
2020-07-21T16:37:12 <Houcemeddine Turki> reply to <Zbyszko Papierski> "We will be launching…" This is what I talked about. Is there any work on a solution to the timeout limits?
2020-07-21T16:37:45 <Lucas Werkmeister> reply to <Jan Ainali> "Is it also announced…" I remember it’s regularly announced on some mailing list, at least (wikitech-l? don’t remember)
2020-07-21T16:38:19 <Guillaume Lederrey> it is announced on the Discovery mailing list and we're planning on announcing it on the wikidata mailing list as well
2020-07-21T16:38:39 <Guillaume Lederrey> The main effort in scaling is a new streaming updater
2020-07-21T16:39:16 <David Causse> The rewrite of the WDQS updater is in progress, a test server will be setup soon to start collecting performance metrics, the goal is to support spikes of ~1000 edits/minute and stop being a bottleneck for bots. You can follow T244590 for regular updates.
2020-07-21T16:39:34 <Jan Ainali> reply to <Guillaume Lederrey> "it is announced on t…" Ok, I will miss them until then. Not signing up for a new mailing list just for that. Would've been happy to add any page to my watchlist though.
2020-07-21T16:39:59 <Houcemeddine Turki> reply to <David Causse> "The rewrite of the W…" This is a work :)
2020-07-21T16:40:00 <মাহির মোরশেদ> reply to <Zbyszko Papierski> "We will be launching…" Given that only fairly recently have proper dumps been made of Commons structured data, what's the ETA on the data accessible from this endpoint being updated live in the same way as Wikidata's Query Service?
2020-07-21T16:40:25 <Guillaume Lederrey> As part of scaling, we need a better understanding of the patterns of data and access.
2020-07-21T16:40:34 <Guillaume Lederrey> So we brought our own guest: Joseph
2020-07-21T16:40:49 <Lucas Werkmeister> reply to <Jan Ainali> "Ok, I will miss them…" https://etherpad.wikimedia.org/p/Search_Platform_Office_Hours suggests it’s usually the first Wednesday of the month, if that helps…?
2020-07-21T16:41:02 <Zbyszko Papierski> reply to <Houcemeddine Turki> "This is what I talke…" The issue is quite complex on our side. We are currently working on a medium- to long-term solution. It will definitely involve more changes, both to WCQS and WDQS. As mentioned, our current priority is the streaming updater, but the scalability issues of the query service itself (one of the reasons for timeout limits) are soon to follow.
2020-07-21T16:41:05 <Joseph Allemandou> Hi :) I'm the guests's guest ;)
2020-07-21T16:41:37 <Joseph Allemandou> I work at the WMF in the analytics team, mostly on hadoop and distributed computation.
Lately I have done some analysis on WDQS usage (see https://wikitech.wikimedia.org/wiki/User:Joal/WDQS_Traffic_Analysis)
2020-07-21T16:41:37 <Houcemeddine Turki> reply to <Zbyszko Papierski> "Issue is quite compl…" That is excellent work
2020-07-21T16:41:39 <Markus Göllnitz> reply to <Joseph Allemandou> "Hi :) I'm the guests…" Do you have any guests?
2020-07-21T16:41:46 <Joseph Allemandou> :D
2020-07-21T16:42:04 <Léa ~ Auregann> reply to <Joseph Allemandou> "Hi :) I'm the guests…" The guests' guests are our guests :)
2020-07-21T16:42:07 <Houcemeddine Turki> :)
2020-07-21T16:42:10 <Sam Alipio> reply to <Markus Göllnitz> "Do you have any gues…" 😄
2020-07-21T16:42:11 <Jan Ainali> reply to <Lucas Werkmeister> "https://etherpad.wik…" It helps me. But I cannot add it to m:Events calendar if it's not on a wiki page
2020-07-21T16:42:20 <Joseph Allemandou> I'm also trying to build a POC of answering SPARQL queries using Spark (a distributed computation engine), but I'm not there yet :)
2020-07-21T16:43:26 <Zbyszko Papierski> reply to <মাহির মোরশেদ> "Given that only fair…" We don't know yet - that's definitely a plan, given growth of SDoC volume, we are going to start addressing that soon.
2020-07-21T16:43:35 <Guillaume Lederrey> reply to <Jan Ainali> "It helps me. But I c…" We haven't had a need for a wiki page yet, but we can probably create one. I'll get back to you on that!
2020-07-21T16:43:56 <Joseph Allemandou> In the link I pasted above there are plenty of graphs. I'll try to summarize some findings here: the vast majority of computation time (time taken to compute a query) is taken by a very small number of requests
2020-07-21T16:44:16 <Jan Ainali> reply to <Guillaume Lederrey> "We haven't had a nee…" Are you working on something that is not described on any wiki?
2020-07-21T16:45:26 <Markus Göllnitz> reply to <wikilinksbot> "[[m:Events calendar]…" Is that exposed via iCalendar?
2020-07-21T16:46:10 <Joseph Allemandou> Something else interesting to note is that there are some bots doing a lot of requests, and those are usually fast, while requests that are slow (more than 1s, even more than 10s) are run from user-agents doing a small number of queries (less than 10), and those user-agents are human-style
2020-07-21T16:46:44 <Jan Ainali> reply to <Markus Göllnitz> "Is that exposed via …" No, looks like json on a wikipage m:Events calendar/about
2020-07-21T16:46:48 <Léa ~ Auregann> Let's keep the discussions about the calendar out for the moment if you want, so we can focus on the content of the office hour :)
2020-07-21T16:47:49 <Guillaume Lederrey> Next steps in terms of scaling is to finish our work on the new streaming updater. And then we need more analysis to understand what the options are
2020-07-21T16:48:04 <Houcemeddine Turki> reply to <Joseph Allemandou> "Something else inter…" This is explained by the fact that SPARQL queries are used to infer new knowledge
2020-07-21T16:48:40 <Houcemeddine Turki> E.g. We can find X P2176 Y where Y P2175 X does not exist.
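[Editor's note] The pattern described in the message above (a statement X P2176 Y with no matching Y P2175 X) could be expressed as a SPARQL query roughly as follows. This is a minimal sketch, not something shown during the office hour: the helper name `missing_inverse_query` and the `limit` parameter are hypothetical, and it assumes the `wdt:` prefix is predefined by the query service, as it is on WDQS. The function only builds the query text and does not send it anywhere.

```python
def missing_inverse_query(prop: str, inverse_prop: str, limit: int = 100) -> str:
    """Build a SPARQL query for statements whose inverse statement is missing.

    Hypothetical helper for illustration; it returns the query text only.
    """
    return (
        "SELECT ?x ?y WHERE {\n"
        f"  ?x wdt:{prop} ?y .\n"
        # Keep only pairs where the reverse statement does not exist.
        f"  FILTER NOT EXISTS {{ ?y wdt:{inverse_prop} ?x . }}\n"
        "}\n"
        f"LIMIT {limit}"
    )


# The property pair named in the chat message.
query = missing_inverse_query("P2176", "P2175")
print(query)
```

Queries of this shape scan every statement of one property, which may be one reason they end up among the slower requests mentioned earlier.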
2020-07-21T16:48:50 <Léa ~ Auregann> Thanks a lot, guests & guests!
2020-07-21T16:49:18 <Léa ~ Auregann> Now, we have 40min for all of our questions on any kind of Wikidata- and Wikibase-related topic :)
2020-07-21T16:49:31 <Léa ~ Auregann> Feel free to ask anything and we will find the best person to answer!
2020-07-21T16:50:18 <Houcemeddine Turki> Just continuing what Joseph has said.
2020-07-21T16:50:20 <মাহির মোরশেদ> Is work on arbitrary access to lexicographical data still stalled?
2020-07-21T16:51:11 <Houcemeddine Turki> Some bots are using SPARQL queries to infer inverse relations and add them to Wikidata.
2020-07-21T16:51:27 <Lydia Pintscher> reply to <মাহির মোরশেদ> "Is work on arbitrary…" Unfortunately yes. But I'm trying to get it unblocked once we have the Bridge out.
2020-07-21T16:51:34 <Houcemeddine Turki> Is there any work to automate this interesting process?
2020-07-21T16:52:03 <Lydia Pintscher> reply to <Houcemeddine Turki> "Is there any work to…" Not from the dev team side, no.
2020-07-21T16:52:06 <মাহির মোরশেদ> reply to <Houcemeddine Turki> "Is there any work to…" The oppose votes on m:Community_Wishlist_Survey_2019/Wikidata/Automatically_add_inverse_properties (one of them mine) suggest that this would be a bad idea
2020-07-21T16:52:30 <Lydia Pintscher> Thanks for the link :)
2020-07-21T16:52:36 <Lydia Pintscher> That's one of the reasons.
2020-07-21T16:53:01 <Houcemeddine Turki> reply to <মাহির মোরশেদ> "The oppose votes on …" I see. However, this is one of the reasons why bots are using SPARQL.
2020-07-21T16:54:01 <Jan Ainali> What's the status for the watchlist improvements? T90435 and associated tickets
2020-07-21T16:54:46 <Houcemeddine Turki> reply to <wikilinksbot> "[[m:Community_Wishli…" I opened it and found that most people voted support. :)
2020-07-21T16:55:01 <Lydia Pintscher> reply to <Jan Ainali> "What's the status fo…" As part of the work on Bridge this became more pressing. I had it on my list for the next prototyping week (which is next week) but unfortunately it didn't make the cut so we'll have to see how else to tackle it.
2020-07-21T16:55:38 <Jan Ainali> reply to <Lydia Pintscher> "As part of the work …" At least not completely forgotten then. That's hopeful!
2020-07-21T16:55:49 <Léa ~ Auregann> By the way, if you have any questions about Search and WDQS later on, we created a contact page here: https://www.wikidata.org/wiki/Wikidata:Contact_the_development_team/Query_Service_and_search
2020-07-21T16:55:50 <Lydia Pintscher> Oh absolutely not forgotten :D
2020-07-21T16:56:06 <Lydia Pintscher> It's clearly important. Just so many important things and so little time :(
2020-07-21T16:56:46 <Houcemeddine Turki> Another idea is the use of SPARQL to analyze labels of scholarly publications and link research papers to their topics using P921
2020-07-21T16:57:55 <Arthur Smith> Guillaume and company, thanks for being here today. Do you have a good feel for how the balance between updates and (heavy) queries is affecting the WDQS updating? It's been particularly bad today, with a period of 2 hours earlier today when there was only about 8 minutes when bot edits could be done (maxlag < 5). So there can't be that many updates going on, it must be query load that's delaying the updates?
2020-07-21T16:58:12 <Arthur Smith> Or maybe somebody's evading the maxlag limit...
2020-07-21T16:58:47 <Houcemeddine Turki> reply to <Arthur Smith> "Guillaume and compan…" +1
2020-07-21T16:59:11 <Moebeus> @Nightrose I've been curious about https://www.wikidata.org/wiki/Property:P2559 for awhile. What is that "corresponding feature" exactly, is it related to anything that you guys do or more in the hobbyist space? If it's not relevant to this office hour I apologize in advance
2020-07-21T16:59:56 <মাহির মোরশেদ> reply to <Arthur Smith> "Or maybe somebody's …" Related to that, I've noticed that a lot of background (and some browser-based) QS batches run incredibly fast, akin to what I would expect from bots (see User:Wiki13's contribution history for an example). Is there any way to temper this?
2020-07-21T17:01:46 <David Causse> reply to <Arthur Smith> "Guillaume and compan…" A good feel is a bit strong but we noticed that when the load of the machines is up to >35 the lag starts to rise. Load is definitely caused by queries. The streaming updater will help on this. As for the complex queries this is part of a much broader analysis of usage patterns.
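[Editor's note] For readers unfamiliar with the "maxlag < 5" condition mentioned above: the MediaWiki Action API accepts a `maxlag` parameter, and when the reported lag exceeds that value the request is refused with an error whose code is `maxlag`. A well-behaved bot then waits and retries. The sketch below shows only the client-side bookkeeping, with no network calls; the helper names `with_maxlag` and `should_back_off` are hypothetical and not part of any library.

```python
def with_maxlag(params: dict, maxlag_seconds: int = 5) -> dict:
    """Return a copy of the API parameters with a maxlag value added."""
    return {**params, "maxlag": str(maxlag_seconds)}


def should_back_off(response_json: dict) -> bool:
    """True if the API refused the request because the lag was too high."""
    return response_json.get("error", {}).get("code") == "maxlag"


# Example: parameters for an edit request, before any edit-specific fields.
params = with_maxlag({"action": "edit", "format": "json"})
print(params)
```

A bot that omits the parameter is never refused on these grounds, which is what "evading the maxlag limit" refers to.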
2020-07-21T17:01:47 <Houcemeddine Turki> reply to <মাহির মোরশেদ> "Related to that, I'v…" I do not think that we should slow down QS.
2020-07-21T17:01:58 <Lydia Pintscher> reply to <Arthur Smith> "Or maybe somebody's …" There are definitely people not respecting it and it is a bit like playing whack-a-mole unfortunately. One of the underlying problems is that the Bot group is set to have no ratelimit. We are currently looking into changing that so bots actually have a technically enforced ratelimit of say 90 edits/sec.
2020-07-21T17:02:20 <Lydia Pintscher> That will hopefully help as well but happy to discuss.
2020-07-21T17:03:07 <Nikki> per second or per minute?
2020-07-21T17:03:18 <Lydia Pintscher> eh sorry
2020-07-21T17:03:20 <Lydia Pintscher> minute of course :D
2020-07-21T17:03:22 <Houcemeddine Turki> 90 edits per second is not bad
2020-07-21T17:03:23 <Nikki> :D
2020-07-21T17:03:26 <Lydia Pintscher> https://phabricator.wikimedia.org/T258354 is the corresponding ticket
2020-07-21T17:03:54 <Markus Göllnitz> reply to <Houcemeddine Turki> "90 edits per second …" it is bad
2020-07-21T17:04:29 <Lydia Pintscher> reply to <Moebeus> "@Nightrose I've bee…" This was created because people were (rightfully) upset about usage instructions being written into the description field. That makes descriptions pretty useless for the rest of the world.
2020-07-21T17:04:38 <Houcemeddine Turki> reply to <Markus Göllnitz> "it is bad" I see. :)
2020-07-21T17:04:52 <Lydia Pintscher> What's missing now is using this new property in the UI of Wikidata.
2020-07-21T17:05:02 <Lydia Pintscher> We have not gotten to this yet unfortunately :(
2020-07-21T17:05:17 <Lucas Werkmeister> reply to <Lydia Pintscher> "This was created bec…" for example ^^ https://twitter.com/ReaderMeter/status/814254699805339648
2020-07-21T17:05:18 <Markus Göllnitz> reply to <Lydia Pintscher> "This was created bec…" This small world outside Wikidata?
2020-07-21T17:05:29 <Lydia Pintscher> reply to <Markus Göllnitz> "This small world out…" :D
2020-07-21T17:05:34 <Lydia Pintscher> That one
2020-07-21T17:05:46 <Léa ~ Auregann> Any more questions for the Wikidata & Wikibase development team or the Search Platform team?
2020-07-21T17:06:12 <Moebeus> reply to <Lydia Pintscher> "We have not gotten t…" is there a Phabricator ticket?
2020-07-21T17:06:24 <Houcemeddine Turki> Are you interested in adding support for news items?
2020-07-21T17:06:24 <Lydia Pintscher> reply to <Moebeus> "is there a Phabricat…" Let me see
2020-07-21T17:07:23 <Lydia Pintscher> reply to <Moebeus> "is there a Phabricat…" https://phabricator.wikimedia.org/T140131
2020-07-21T17:07:32 <Markus Göllnitz> reply to <Sam Alipio> "and finally, prototy…" Will it be influenced by and/or influence the UI for SDC? If the former, will we still get links to the items/properties in the suggestion dropdowns?
2020-07-21T17:07:54 <Lydia Pintscher> reply to <Houcemeddine Turki> "Are you interesting …" Do you mean a new entity type like Item/Property/... for news?
2020-07-21T17:07:55 <Moebeus> reply to <Lydia Pintscher> "https://phabricator.…" nice! thank you
2020-07-21T17:08:01 <Arthur Smith> I guess I'm interested in how the WDQS team will verify that the new updater is behaving correctly - can you compare triples in detail, or is there a sample collection of items that would be used, or ...?
2020-07-21T17:08:18 <Houcemeddine Turki> reply to <Lydia Pintscher> "Do you mean a new en…" Yes
2020-07-21T17:08:49 <Jan Ainali> reply to <Houcemeddine Turki> "Yes" Wikinews sitelinks are not good enough?
2020-07-21T17:09:11 <Houcemeddine Turki> reply to <Jan Ainali> "Wikinews sitelinks a…" No
2020-07-21T17:09:32 <মাহির মোরশেদ> How much does the Search Platform team foresee continuing development on Blazegraph in earnest, given that Wikidata so highly depends on it and that external development on it seems to be dormant otherwise?
2020-07-21T17:09:33 <Houcemeddine Turki> I am dealing with news facts.
2020-07-21T17:09:36 <David Causse> reply to <Arthur Smith> "I guess I'm interest…" We will have the streaming updater running on a test server, this server will be used first for perf evaluation and then later I hope we can make it accessible to the community for testing.
2020-07-21T17:09:49 <Lydia Pintscher> reply to <Markus Göllnitz> "Will it be influence…" It will not immediately influence it - possibly the team working on commons will pick it up. It'll not change the suggestions.
2020-07-21T17:10:02 <Houcemeddine Turki> Not items about them in Wikinews.
2020-07-21T17:10:37 <Moebeus> reply to <Jan Ainali> "Wikinews sitelinks a…" hint: Houcemeddine Turki talked about this on the Wikispore Day stream you were both on 😉
2020-07-21T17:10:45 <Lydia Pintscher> reply to <Houcemeddine Turki> "Yes" There are currently no plans because I haven't seen demand for it.
2020-07-21T17:11:13 <Arthur Smith> David sounds good - I'll look forward to testing it!
2020-07-21T17:11:28 <Houcemeddine Turki> reply to <Lydia Pintscher> "There are currently …" I see. Is there a way to apply?
2020-07-21T17:12:25 <Lydia Pintscher> reply to <Houcemeddine Turki> "I see. Is there a wa…" Of course. Making a plan and trying to convince people that it's more important than the other things people have been asking for here today.
2020-07-21T17:12:27 <Guillaume Lederrey> reply to <মাহির মোরশেদ> "How much does the Se…" We're contributing some bug fixes to Blazegraph, but we're not interested in supporting Blazegraph itself. We are interested in evaluating alternatives to Blazegraph, but we haven't found a compelling enough alternative yet. That being said, we are also trying to reduce our dependency on Blazegraph, for example our new updater is isolated from Blazegraph as much as possible and most of it could be reused with a different backend.
2020-07-21T17:13:08 <Arthur Smith> Joseph I've been looking at your analysis page, very interesting. Who exactly uses the "internal cluster" - is it purely Wikidata, or other Wikimedia services? I assume it's not available to toolforge tools?
2020-07-21T17:13:43 <Joseph Allemandou> Hi Arthur - I'll let Guillaume or David answer :)
2020-07-21T17:13:56 <Joseph Allemandou> Thanks for looking Arthur
2020-07-21T17:14:09 <Guillaume Lederrey> reply to <Arthur Smith> "Joseph I've been loo…" The internal cluster is used only internally by WMF to integrate with other production services. It has stronger restrictions (for example shorter timeouts) to provide a bit more predictability.
2020-07-21T17:14:43 <Houcemeddine Turki> reply to <Lydia Pintscher> "Of course. Making a …" I see.
2020-07-21T17:15:28 <David Causse> reply to <Guillaume Lederrey> "The internal cluster…" Use cases are Wikidata constraint checks and the "deepcat" search keywords available on all wikis.
2020-07-21T17:15:49 <মাহির মোরশেদ> reply to <Guillaume Lederrey> "We're contributing s…" With respect to Blazegraph alternatives, what criteria are you using to evaluate them?
2020-07-21T17:17:22 <Guillaume Lederrey> reply to <মাহির মোরশেদ> "With respect to Blaz…" We haven't started a formal evaluation yet. The criteria should include scaling strategy, support of our existing feature set (including the ability to extend the service), ease of administration, ...
2020-07-21T17:17:57 <Houcemeddine Turki> reply to <Guillaume Lederrey> "We haven't started a…" +1
2020-07-21T17:17:59 <Markus Göllnitz> reply to <Lydia Pintscher> "As part of the work …" Is it a general list of what improvements people wish for in the watchlist? If so, is grouping by EditGroups on the list? That should reduce a lot of flooding without hiding edits. 🤔
2020-07-21T17:18:47 <Lydia Pintscher> reply to <Markus Göllnitz> "Is it a general what…" It is a general wishlist :D Integrating EditGroups more tightly into Wikibase or even MediaWiki is definitely a wish on the table.
2020-07-21T17:18:56 <Lydia Pintscher> But unfortunately not on the trivial side at all.
2020-07-21T17:19:07 <Lydia Pintscher> But it could definitely help a lot I agree.
2020-07-21T17:20:34 <Lydia Pintscher> https://phabricator.wikimedia.org/T203557 is one ticket in that direction.
2020-07-21T17:23:06 <Zbyszko Papierski> reply to <Guillaume Lederrey> "We haven't started a…" Note that the formal evaluation will be based on use cases, and it's possible that some of them will be better served by other means than WDQS (and Blazegraph in particular), which means that a possible replacement may not be 1-to-1.
2020-07-21T17:24:39 <Léa ~ Auregann> We're approaching the end of the office hour, thanks a lot for joining and asking questions, thanks also to our guests :)
2020-07-21T17:25:16 <Sam Alipio> reply to <Léa ~ Auregann> "We're approaching th…" 👏thanks search team guests (and guest of guests)
2020-07-21T17:25:31 <Lydia Pintscher> ❤
2020-07-21T17:25:34 <Guillaume Lederrey> Thanks for the invitation!
2020-07-21T17:26:24 <Arthur Smith> <sticker>