Wikidata:Bot requests/Archive/2018/11
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Fixing URLs in Sources
Request date: 12 September 2018, by: MichaelSchoenitzer
- Task description
After an update of the software hosting the site, all the links to tags in the GNOME GitLab no longer work. They are used heavily in sources for version numbers. Can someone update the 240 links with a bot/script? From
https://git.gnome.org/browse/([^/]*)/tag/?h=(.*)
to
https://gitlab.gnome.org/GNOME/$1/tags/$2
Here's a query searching for the cases:
select ?item ?st ?url where {
?item p:P348 ?st.
?st prov:wasDerivedFrom ?src.
?src pr:P854 ?url.
FILTER CONTAINS(STR(?url), "https://git.gnome.org/browse")
}
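The requested rewrite can be sketched in Python as follows. This is only an illustration, not the script that was actually run. It assumes the `?` in the pattern above is meant literally (the query-string separator), so it is escaped here; the function name `rewrite_tag_url` is made up for this example.

```python
import re

# Assumption: "?" in the requested pattern is a literal character of the old URL,
# so it is escaped; the dots in the host name are escaped as well.
OLD_TAG_URL = re.compile(r"^https://git\.gnome\.org/browse/([^/]*)/tag/\?h=(.*)$")


def rewrite_tag_url(url: str) -> str:
    """Return the gitlab.gnome.org form of an old tag URL, or the URL unchanged."""
    return OLD_TAG_URL.sub(r"https://gitlab.gnome.org/GNOME/\1/tags/\2", url)
```

URLs that do not match the old pattern are returned unchanged, so the function can be applied to every reference URL returned by the query without extra filtering.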
- Discussion
- Request process
Task completed (00:27, 2 November 2018 (UTC))
- This section was archived on a request by: MichaelSchoenitzer (talk) 00:27, 2 November 2018 (UTC)
Import SL-language items into the WikiData medical repository
- Request process
Request date: 26 November 2018, by: Vitosmo
- Link to discussions justifying the request
https://sl.wikipedia.org/wiki/Uporabni%C5%A1ki_pogovor:Vitosmo#Bot_Request
- Task description
A flat file of items localized for the SL language is to be imported into the Wikidata repository.
- Licence of data to import (if relevant)
- Discussion
request by Vitosmo (talk) 12:46, 26 November 2018 (UTC)
I've created a request for permission.
Two notes about the table:
- There are two nearly empty lines containing only the string sl.
- A few names contain parentheses. Normally text in parentheses is used to distinguish different pages with the same title, but this is not necessary in Wikidata because there are descriptions.
Do you want to update the table?
--Pyfisch (talk) 16:24, 1 December 2018 (UTC)
- I'll take care of the two questions by tomorrow at the latest. Sorry for the delay
- Vitosmo (talk) 21:43, 14 December 2018 (UTC)
- Pyfisch: corrected the two errors indicated - thanks, and off we go (g)
- Pinky sl: Passed the texts through the spellcheck/Besana strainer
- Vitosmo (talk) 10:29, 15 December 2018 (UTC)
Should the old sl labels be set as an alias? --Pyfisch (talk) 12:38, 7 December 2018 (UTC)
- I work with Vitosmo on this request. I am also an admin on sl wiki. You can set old sl labels as alias. --Pinky sl (talk) 07:37, 8 December 2018 (UTC)
- Request process
- FischBot 6
Task completed (19:19, 18 December 2018 (UTC))
- This section was archived on a request by: Pyfisch (talk) 19:19, 18 December 2018 (UTC)
move descriptions in German from English to German description
Special:Search/Beruf/Funktion seems to find a lot. Sample edit: [1] --- Jura 16:07, 2 November 2018 (UTC)
Section resolved: Manually moved 9 entries. Can't find more German descriptions in the English field. Pyfisch (talk) 17:32, 1 December 2018 (UTC)
- @Pyfisch: did you click on the search link ? It currently gives 15,607 results and any I clicked on aren't in English. Maybe you need to change your interface language to English --- Jura 05:52, 4 December 2018 (UTC)
- Here is a query that finds some [2]. --- Jura 06:00, 4 December 2018 (UTC)
- Interesting. I searched with "Pages in this language: English", which only shows a single result. But if I switch the interface language to English I get a whole lot of entities. Thanks for the SPARQL query! I am not sure, though, whether we want to move all these descriptions to German, because they are rather long, unlike other descriptions that serve to disambiguate between persons of the same name. --Pyfisch (talk) 09:53, 4 December 2018 (UTC)
- Feel free to improve them, but we surely don't want German text in the English description field. --- Jura 09:58, 4 December 2018 (UTC)
- I am now moving descriptions from English to German. (details) --Pyfisch (talk) 09:27, 11 December 2018 (UTC)
- (more) --Pyfisch (talk) 11:18, 11 December 2018 (UTC)
- (more) --Pyfisch (talk) 13:37, 11 December 2018 (UTC)
- (last one) I should have fixed most (all). @Jura1: If you have a query that produces more items, please tell me.--Pyfisch (talk) 19:00, 11 December 2018 (UTC)
- Thanks. Seems mostly done. Searching for "Konfession" finds a few more. --- Jura 16:50, 12 December 2018 (UTC)
- "Konfession" done. --Pyfisch (talk) 21:24, 14 December 2018 (UTC)
- This section was archived on a request by: Matěj Suchánek (talk) 14:29, 15 February 2019 (UTC)
Populating P3722 (P3722)
Request date: 30 May 2018, by: Thierry Caro
- Link to discussions justifying the request
- None.
- Task description
Take all instances of subclasses of geographical feature (Q618123). Look for those that have a Commons category (P373) statement and visit the corresponding Commons category. If it includes another category whose name is "Maps of" followed by the parent category's own name, import this value as P3722 (P3722) to the item. This would be useful to the French Wikipedia, where we now have Q54473574 automatically populated through Template:Geographical links (Q28528875).
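The matching rule in this task description can be sketched as a small Python function. It is a sketch only: the function name and the way subcategory names are passed in are assumptions, and fetching the subcategories from Commons is left out.

```python
from typing import Optional, Sequence


def find_maps_category(commons_category: str, subcategories: Sequence[str]) -> Optional[str]:
    """Return "Maps of <commons_category>" if it is among the subcategories.

    commons_category is the P373 value of the item; subcategories are the names
    of the categories contained in that Commons category (hypothetical input).
    """
    wanted = f"Maps of {commons_category}"
    for name in subcategories:
        # Accept names with or without the "Category:" namespace prefix.
        if name.startswith("Category:"):
            name = name[len("Category:"):]
        if name == wanted:
            return wanted
    return None
```

For Vernon County (Q496440) with the Commons category "Vernon County, Missouri", a subcategory named "Maps of Vernon County, Missouri" would be returned as the value to import.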
- Licence of data to import (if relevant)
- None
- Discussion
- Comment. Hi. Is there still someone here? Thierry Caro (talk) 23:27, 8 November 2018 (UTC)
- @MisterSynergy: This could be for you, couldn't it? Thierry Caro (talk) 19:52, 17 February 2019 (UTC)
- Uff, not sure. Let's talk about one specific example:
- Vernon County (Q496440) has commons:Category:Vernon County, Missouri linked via Commons category (P373)
- commons:Category:Maps of Vernon County, Missouri is a subcategory of commons:Category:Vernon County, Missouri at Commons
- so import "Maps of Vernon County, Missouri" as value for P3722 (P3722) in Vernon County (Q496440)
- Correct? This would require quite some specific code to be written which I do not have yet; however, it does appear doable I think. Do you have an estimation how many statements could be created this way? —MisterSynergy (talk) 20:11, 17 February 2019 (UTC)
- @MisterSynergy: Yes, that's correct. I have no idea how many imports could be made this way but I have the feeling that there could be a lot. My guess would be in the tens of thousands? Thierry Caro (talk) 20:36, 17 February 2019 (UTC)
- I just ran a query here which yields ~30k categories at Commons which are named "Maps of …". However, there is stuff like commons:Category:Maps of 19th-century Europe of which I am not sure whether it qualifies as P3722 value for any item. What do you think? Do we have other "special cases"? —MisterSynergy (talk) 20:41, 17 February 2019 (UTC)
- @MisterSynergy: No, I have nothing else in mind that could create problems. I think having things like commons:Category:Maps of 19th-century Europe as a value is OK. Plus I believe that most of the time, for these special cases, the item – here an item for 19th-century Europe – won't exist anyway. But if the item does exist, then fine. You may add the relevant statement through P3722 (P3722). Thierry Caro (talk) 02:27, 18 February 2019 (UTC)
- I found some time to write the code for such an import, and sample diffs are these: [3][4][5]. Looks good so far. If you are fine with it, I will try to get a bot job approved for this import, to perform it with User:MsynBot. I'd iterate over all ~30k "Maps of" categories from my previous comment, and my guess would be that in around 50% of the cases there could be an import. --MisterSynergy (talk) 19:47, 22 February 2019 (UTC)
- @MisterSynergy: Everything is fine, as far as I'm concerned. Thank you. Thierry Caro (talk) 21:00, 22 February 2019 (UTC)
- This is done now; we went from initially 167 P3722 statements up to almost 13,000 now. Quite a lot of the ~30,000 input categories do not fit anywhere right now, either due to their structure, for which a Wikidata item is missing (such as for example commons:Category:Maps of weather and climate of Sri Lanka, commons:Category:Maps of the world before Columbus, commons:Category:Maps of borders of Sweden, and many others), or because of the single value constraint on P3722, which does not permit adding historical categories (such as commons:Category:Maps of 17th-century Europe). --MisterSynergy (talk) 15:02, 1 March 2019 (UTC)
- @MisterSynergy: OK. Thank you very much for your dedicated work. There are now almost a thousand active pages using this property on the French Wikipedia, as you may see here. This is great. Thierry Caro (talk) 23:59, 3 March 2019 (UTC)
- @MisterSynergy: If you want to try to get more results for the sake of it, you may try to look for the "Maps of" category not in the main category of the given item but within its "Geography of" subcategory, if it exists. For example, go from Martinique (Q17054) to Category:Martinique, then from there to Category:Geography of Martinique and then get to Category:Maps of Martinique eventually. This should let us reach a few hundred more. But then again I'm already fine with what you've done! Thanks. Thierry Caro (talk) 00:16, 4 March 2019 (UTC)
- Request process
Task completed —MisterSynergy (talk) 15:02, 1 March 2019 (UTC)
- This section was archived on a request by: MisterSynergy (talk) 15:02, 1 March 2019 (UTC)
Add annual country level unemployment rate (P1198)
It would be interesting to have annual data for each country (1 value per year for country items). I'm not sure what are the most suitable sources for each country.
When discussing a query with CalvinBall, I noticed that Q30#P1198 currently only has one value (for 2013). --- Jura 12:34, 26 October 2018 (UTC)
- Hi Jura, we could use the WDBot to do this job. The source could be World Bank Data - here an example for the USA: https://data.worldbank.org/indicator/SL.UEM.TOTL.ZS?locations=US. The World Bank uses ILO estimates, which have the following nice properties (check the Details button for the indicator on the WB's page):
- Statistical Concept and Methodology: [...] The standard definition of unemployed persons is those individuals without work, seeking work in a recent past period, and currently available for work, including people who have lost their jobs or who have voluntarily left work. Persons who did not look for work but have an arrangements for a future job are also counted as unemployed. Some unemployment is unavoidable. At any time some workers are temporarily unemployed between jobs as employers look for the right workers and workers search for better jobs. It is the labour force or the economically active portion of the population that serves as the base for this indicator, not the total population. The series is part of the ILO estimates and is harmonized to ensure comparability across countries and over time by accounting for differences in data source, scope of coverage, methodology, and other country-specific factors. The estimates are based mainly on nationally representative labor force surveys, with other sources (population censuses and nationally reported estimates) used only when no survey data are available..
- If this is fine for you I would make a request for bot permission (the script is already available). Datawiki30 (talk) 15:50, 26 October 2018 (UTC)
- Sounds good. Maybe the qualifier criterion used (P1013) could be used with an item that describes the applied methodology. Eventually, we might have numbers with different methodologies for the same year. --- Jura 16:04, 26 October 2018 (UTC)
- Hi Jura and thank you for your feedback. Do we really need the additional qualifier? Similar to the GDP I would just use "stated in" = World Bank database and "reference URL" = https://data.worldbank.org/indicator/SL.UEM.TOTL.ZS?locations=XX where XX is the ISO code of the country. My opinion is that qualifiers trying to explain the data are too short to describe the method behind the data... For new methods I would just suggest proposing a new property (like there are different properties for total, male and female population). Cheers! Datawiki30 (talk) 16:36, 26 October 2018 (UTC)
- I think it's useful. The value would just be a (new) specific item. There is no need for its label or description to include the full text. I think it's an advantage as the above numbers may be useful for cross-country comparison, but some users might be looking for just one country and expect the methodology preferred in the country. For your bot it might be possible to do all in one edit, so the additional work would be marginal. --- Jura 16:44, 26 October 2018 (UTC)
- Thank you Jura for your comment. There are no technical obstacles - the bot can handle this. I suppose you mean that we could have structurally different values in the same property - for example, the value from ILO for country A could be 5%, whereas for the same year and country the value from Eurostat could be 3%. Is this the case? Datawiki30 (talk) 21:29, 26 October 2018 (UTC)
- Yes. I had in mind mainly national agencies that might have 4% instead of 10% (by whatever method), but it's the same issue. --- Jura 16:05, 2 November 2018 (UTC)
- @Jura1: OK. Could you please take a look here? After a discussion in the project chat and here I think that it would be best to import only the most recent data (for example for 2017). Otherwise we could have problems with the loading time of the country pages. I would be glad to see your comment there. Cheers! Datawiki30 (talk) 19:18, 12 November 2018 (UTC)
- This isn't really helpful to look at the evolution. I think it could easily hold annual data for the last 50 years. There are several other properties that have annual data. --- Jura 04:24, 13 November 2018 (UTC)
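The per-country reference URL mentioned in the discussion above can be built as in the following Python sketch. The function name is made up for illustration, and it assumes the two-letter ISO code shown in the US example; it does not fetch or import any data.

```python
WORLD_BANK_INDICATOR = "SL.UEM.TOTL.ZS"  # indicator code taken from the URL quoted above


def world_bank_reference_url(iso_code: str) -> str:
    """Build the reference URL for one country (iso_code: assumed two-letter code)."""
    code = iso_code.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"expected a two-letter country code, got {iso_code!r}")
    return f"https://data.worldbank.org/indicator/{WORLD_BANK_INDICATOR}?locations={code}"
```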
trainer-stations
Request date: 12 November 2018, by: Fundriver
- Task description
Is it possible to harvest the trainer data for coach of sports team (P6087) from the different infoboxes in the German Wikipedia? It should be pretty similar to the harvesting for member of sports team (P54) and could be done with the same syntax for different sports, because the infoboxes are similar for the different sports in the German Wikipedia (except for ice hockey): you could use trainer_tabelle in Template:Infobox Rugby Union biography (Q14373909), Template:Infobox football biography (Q5616966), Template:Infobox basketball biography (Q5831659) and Template:Infobox floorball player (Q20963207) with the same technique. You should probably just take care not to import data that isn't totally clear. Sometimes there is a "(Co-Tr.)", "U-21" or "U21" in addition to the wikilink, and these need manual oversight. But a "(Co-Tr)", for example, could be used to refine a statement, if this is possible. Fundriver (talk) 09:52, 12 November 2018 (UTC)
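A harvesting script could flag the unclear entries before import, roughly as in this Python sketch. The layout of trainer_tabelle is not described here, so the function simply takes the text of one station as a string; the function name and the exact markers are assumptions based on the examples in the request.

```python
import re

# Markers named in the request: "(Co-Tr.)" / "(Co-Tr)" and youth teams such as "U-21" or "U21".
AMBIGUOUS_MARKERS = re.compile(r"\(Co-Tr\.?\)|\bU-?\d{2}\b")


def needs_manual_review(station_text: str) -> bool:
    """Return True if a trainer station carries a marker that needs manual oversight."""
    return bool(AMBIGUOUS_MARKERS.search(station_text))
```

Entries for which this returns False would still need the usual checks, for instance that the wikilink resolves to an item for a sports team.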
- Licence of data to import (if relevant)
- Discussion
- Request process
BLKÖ (one time data import)
Most pages in https://de.wikisource.org/wiki/Kategorie:BLK%C3%96 (27209 pages) seem to lack items (http://petscan.wmflabs.org/?psid=6382466 , currently 26641 pages).
I think it would be worth creating them, as well as an item for the person who is the subject of the article if it can't be matched with one of the existing items. --- Jura 07:43, 8 November 2018 (UTC)
Proposal
To get this started I propose this structure for articles. It also mentions from which source each statement is imported. As I see it besides the structure for articles the structure for volumes and person subjects with imported data also needs to be decided. Additionally described by source (P1343) should probably be added to new and existing person subjects. --Pyfisch (talk) 22:29, 11 December 2018 (UTC)
Article
- label in German: title without BLKÖ: prefix but with suffix (BLKÖ) added
- example BLKÖ:Boni, Giannantonio becomes Boni, Giannantonio (BLKÖ)
- this follows the example of the Allgemeine Deutsche Biographie
- description in German: Artikel im Biographischen Lexikon des Kaiserthums Oesterreich
- description in English: entry in the Biographisches Lexikon des Kaiserthums Oesterreich
- instance of (P31): biographical article (Q19389637)
- part of (P361): Biographisches Lexikon des Kaiserthums Oesterreich (Q665807)
- title (P1476): title without BLKÖ: prefix
- (main subject (P921): matching item)
- the item from <WP> if it is set
- the item from <WS> if it is set
- the unique item with the same <GND>
- inconsistencies will be reported
- (follows (P155): imported from <vorher>)
- (followed by (P156): imported from <nachher>)
- published in (P1433): derived from <Band> and <Seite>
- Items for different volumes need to be created with correct statements
- volume (P478): imported from <Band>
- page(s) (P304): imported from <Seite> (the template only contains the first page but some articles span multiple pages)
- first line (P1922): imported from WikiSource needs
- interwiki-link to article
- with badge not proofread (Q20748091) if <BS> equals unkorrigiert
- with badge proofread (Q20748092) if <BS> equals korrigiert
- with badge validated (Q20748093) if <BS> equals fertig
- otherwise without badge
- do we need publication date (P577) if this information is stored by the item in published in (P1433)? (this also applies to volume (P478) but it feels different, so probably add it to the items) --Pyfisch (talk) 22:29, 11 December 2018 (UTC)
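The label and badge rules of the proposal above can be written down as a short Python sketch. The function names are made up for illustration; the badge items and the <BS> values are the ones listed in the proposal.

```python
from typing import Optional

PREFIX = "BLKÖ:"

# <BS> value -> badge item, as listed in the proposal.
BADGE_BY_STATUS = {
    "unkorrigiert": "Q20748091",  # not proofread
    "korrigiert": "Q20748092",    # proofread
    "fertig": "Q20748093",        # validated
}


def article_title(page_title: str) -> str:
    """Title (P1476) value: the page title without the "BLKÖ:" prefix."""
    return page_title[len(PREFIX):] if page_title.startswith(PREFIX) else page_title


def article_label(page_title: str) -> str:
    """German label: title without prefix, with the suffix " (BLKÖ)" added."""
    return f"{article_title(page_title)} (BLKÖ)"


def badge_for(bs_value: str) -> Optional[str]:
    """Badge for the sitelink, or None when <BS> has any other value."""
    return BADGE_BY_STATUS.get(bs_value.strip())
```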
- Thanks for looking into this. There is an outline for a similar one at Wikidata:WikiProject DNB. The ones that are probably most important, are the English language description (to avoid confusion with items about person), P31, published in (P1433), and if possible, main subject (P921). If you can gather some data from the articles, maybe you could create P921 for any missing ones. You could set English label the same as the one suggested above. --- Jura 16:49, 12 December 2018 (UTC)
- I can try to gather some data with regular expressions. But I am certain that some manual work will be needed. Then I can create items for the missing persons. Do you know how well tools like Mix'n'Match or OpenRefine work for this? --Pyfisch (talk) 19:02, 12 December 2018 (UTC)
- I think it should be possible with Mix'n'Match (ID would be QID of the WS page item), but last time I tried, I think Magnus ended up banging his head against a wall ;) . If the bulk can be matched and the remaining ones can be created with DOB/DOD, it shouldn't be much of an issue. --- Jura 19:24, 12 December 2018 (UTC)
- What is the correct claim for pages like wikisource::de::BLKÖ:Alphabetisches_Namen-Register that are not about a person? instance of (P31)biographical article (Q19389637) does not fit. Maybe just instance of (P31)article (Q191067)? --Pyfisch (talk) 19:02, 12 December 2018 (UTC)
- At Help:Import NBD from enwikisource/lists/other pages, I used "list" for most of them. Not sure if that is ideal. --- Jura 19:24, 12 December 2018 (UTC)
- I've made a preliminary data export. It contains all BLKÖ articles with GND, Bearbeitungsstand etc. The articles are linked based on the stated GND, Wikipedia and Wikisource articles; if there was a conflict, multiple Q-numbers are given. I also searched for items linked to the article and unfortunately found many that describe the person instead of the text (they will need to be split). The last four columns state the date/place of birth/death from the text. The dates vary in accuracy:
- year-month-day, year-month, only year
- ~ before date describes imprecise dates
- > before describes dates stated as "nach 1804"
- A before dates describes "Anfang/erste Tage" start of
- E before dates describes "Ende/letzte Tage" end of
- M before dates describes "Mitte" middle of
- ? BLKÖ knows the person was dead but does not know when he/she died
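The date notation listed above could be parsed roughly as in the following Python sketch. The export itself is not shown here, so the exact cell layout (hyphen-separated numbers, prefix directly before the date) is an assumption, as are all names in the code. The precision values follow the Wikidata time model (9 = year, 10 = month, 11 = day).

```python
import re
from typing import NamedTuple, Optional

PREFIX_MEANING = {
    "~": "imprecise",
    ">": "after",
    "A": "start of",
    "E": "end of",
    "M": "middle of",
}
DATE_RE = re.compile(r"^([~>AEM]?)\s*(\d{4})(?:-(\d{1,2}))?(?:-(\d{1,2}))?$")


class ParsedDate(NamedTuple):
    qualifier: Optional[str]   # meaning of the prefix, or None
    year: Optional[int]
    month: Optional[int]
    day: Optional[int]
    precision: Optional[int]   # 9 = year, 10 = month, 11 = day


def parse_blkoe_date(cell: str) -> Optional[ParsedDate]:
    """Parse one date cell; return None if the cell does not fit the assumed layout."""
    cell = cell.strip()
    if cell == "?":
        # The person is known to be dead, but the date is unknown.
        return ParsedDate("unknown", None, None, None, None)
    match = DATE_RE.match(cell)
    if match is None:
        return None
    prefix, year, month, day = match.groups()
    precision = 11 if day else 10 if month else 9
    return ParsedDate(
        PREFIX_MEANING.get(prefix),
        int(year),
        int(month) if month else None,
        int(day) if day else None,
        precision,
    )
```

Cells that return None would be left for manual review rather than guessed.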
The places will need to be manually matched to Q-items. The first column contains some metadata about the kind of page. There are:
- empty: Person
- L: Liste
- F: Family, Wappen, Genealogie
- R: Cross Reference
- P: Prelude
- H: note about names and alternate spellings
- N: corrections, Nachträge
Each group should get a distinct is-a property. @Jura1: Do you like it? This is just for viewing, a later version will be editable to make manual changes before the import. --Pyfisch (talk) 22:14, 18 December 2018 (UTC)
- I like the approach. BTW, there is Help:Dates that attempts to summarize how to add incomplete dates. --- Jura 14:05, 20 December 2018 (UTC)
- editable data export. Updated the exported data. The sheet "articles" is already cleaned up. But I need help to match the ~4000 place names in the sheet "places" to Wikidata Q-Items. --Pyfisch (talk) 16:07, 22 December 2018 (UTC)
- @Pyfisch: maybe OpenRefine could be useful to do this matching. − Pintoch (talk) 13:02, 16 January 2019 (UTC)
- @Pyfisch: thanks a lot for your proposal! Are there any plans to realize this? --M2k~dewiki (talk) 07:16, 10 July 2019 (UTC)
- @M2k~dewiki: Yes, the data is already prepared for the import, but I have not gotten around to writing an import script, getting approval and running the script. --Pyfisch (talk) 09:07, 11 July 2019 (UTC)
- You could do the upload with QuickStatements --- Jura 12:24, 19 July 2019 (UTC)
- If not done yet, I will try to import them. --- Jura 21:19, 26 March 2020 (UTC)
Hello @Jura1:
I started to create the wikidata objects for Kategorie:BLKÖ:Band 1 to Kategorie:BLKÖ:Band 12 (from overall 60) a few days ago with Petscan (example), since I did not want to create them manually anymore for every newly created article in the German wikipedia, which references to an entry from BLKÖ (and this proposal from 2018 has been already archived in January and the user who wrote the initial proposal did not reply anymore Topic:Vg5y62e08pcorztk, also the documents above ("data export") already have been deleted on Google Docs).
I also tried to use Harvest Tools to add the references (example) harvesting from Vorlage:BLKÖ, which only worked in a few cases.
How would you do the cross-referencing of the two items? Which tools would you use? I also checked Wikidata:WikiProject DNB, but I did not find any comment, which tools or techniques have been used. Besides from Quickstatements, HarvestTools, PetScan, I sometimes use external (PERL) scripts, in order to harvest information (e.g. using WWW::Mechanize, responses from https://query.wikidata.org/sparql, ...), to convert/merge data, or to prepare statements for Quickstatements, for example. --M2k~dewiki (talk) 17:21, 27 March 2020 (UTC)
- Yeah, I should have followed through with this earlier. I think PetScan should do. You could also create a QuickStatements batch for new items.
- MxM would probably be ideal, but Magnus would need to make a few adjustments for it to work. If you work with OpenRefine, that might be an alternative. For entries where we can get dates, it might be fairly straightforward to just create them and merge a few duplicates. --- Jura 00:23, 28 March 2020 (UTC)
- Hi @M2k~dewiki:, thanks for starting again to bring some attention on this interesting and important project on deWikisource linking with Wikidata. I'm already maintaining another big project on deWikisource linking to Wikidata - Die Gartenlaube. There we have now more than 13,000 fully qualified bibliographic items on Wikidata. For this purpose i have written a python script to parse the categories and extract the bibliographic information from the infobox on each article page and create import files for QuickStatements.
BLKÖ is already on my todo-list, unfortunately time is running... The above presented data model looks very good to me, i only have some suggestions (from a librarian's perspective): ;-) we shouldn't use in bibliographic context
therefore
- published in (P1433): Biographisches Lexikon des Kaiserthums Oesterreich (Q665807) with additional qualifiers for volume (P478) and page(s) (P304) (but that's not obligatory, because this information exists as main statements too)
If I have some time within the next few days I could point my script to BLKÖ and make some imports as examples for further discussion. --Mfchris84 (talk) 21:59, 28 March 2020 (UTC) (Added a section on German WikiSource to inform the community about this ongoing discussion here: BLKÖ)
- Yes, P1433 does look better. Feel free to move ahead as you prefer. If you create them with QuickStatements, it would have the advantage that everything gets done in one edit. BTW, I just came across an experiment I did with MxM a while ago [6]. I suppose we could get that to work. --- Jura 22:07, 28 March 2020 (UTC)
Currently there are about 8,200 objects for BLKÖ (approx. volumes 1 to 16 out of 60), of which about 300 have a cross-reference so far (User:M2k~dewiki/Tools#Project_BLKÖ):
Today I wrote a program which can cross-reference objects based on the GND.
Examples:
- d:Q88549749 and d:Q70516
- d:Q88549769 and d:Q114618
- d:Q88549791 and d:Q1525727 --M2k~dewiki (talk) 22:30, 28 March 2020 (UTC)
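The program mentioned above is not shown in the discussion. As a hedged illustration of the idea only, cross-referencing by GND could look like the following Python sketch, which follows the proposal's rule of using the unique item with the same GND and reporting inconsistencies. All names and the input format are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def match_by_gnd(
    article_gnd: Dict[str, str],
    person_gnd: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Match article items to person items that share a GND.

    article_gnd maps article QIDs to the GND stated on the Wikisource page;
    person_gnd maps person QIDs to their GND (both hypothetical inputs).
    Returns (matches, conflicts): unique matches, and articles whose GND is
    carried by more than one person item and so needs manual review.
    """
    persons_by_gnd: Dict[str, List[str]] = defaultdict(list)
    for qid, gnd in person_gnd.items():
        persons_by_gnd[gnd].append(qid)

    matches: Dict[str, str] = {}
    conflicts: Dict[str, List[str]] = {}
    for article, gnd in article_gnd.items():
        candidates = persons_by_gnd.get(gnd, [])
        if len(candidates) == 1:
            matches[article] = candidates[0]
        elif len(candidates) > 1:
            conflicts[article] = sorted(candidates)
    return matches, conflicts
```

Each unique match would then yield a main subject (P921) statement on the article item and a described by source (P1343) statement on the person item.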
- Sounds good. I was wondering if we would want to import first line (P1922). Could make things easier. --- Jura 22:39, 28 March 2020 (UTC)
- Hey @Jura1:, first line (P1922) that's a great idea! i made an example here: Schmid, Anton (Musikschriftsteller) (BLKÖ) (Q27134764), so we had also to add copyright status. i add both statements in the model above. --Mfchris84 (talk) 22:53, 28 March 2020 (UTC)
@M2k~dewiki, Jura1:, thanks for these queries, they are very useful. I made some edits on Anschütz, Heinrich (BLKÖ) (Q88549749) to have a "fully qualified" item for a biographical encyclopedia article.
Do we have a consensus about the structure of the labels? I would really prefer to add BLKÖ to the label to make it easy to tell the biographical item from the bibliographical item. This could be very useful, especially for automatic reconciliation processes like in OpenRefine, to avoid people adding biographical identifiers to the encyclopedia article items from BLKÖ instead of to the biographical items. What are your opinions on this question? --Mfchris84 (talk) 22:47, 28 March 2020 (UTC)
Some other useful queries for updating the existing items:
@Mfchris84, Jura1: the label structure d:Q88549769 would be similar to the ADB entries, for example d:Q27568759. --M2k~dewiki (talk) 22:52, 28 March 2020 (UTC)
- In a certain way this looks good to me. My doubt is that, without the encyclopedia abbreviation directly in the label, some bot scripts or less careful reconciliation processes will attach biographical facts to these bibliographic items when they should be stored in the biographical item on Wikidata - but those are things we could catch with Shape Expressions later on! --Mfchris84 (talk) 22:59, 28 March 2020 (UTC)
@Mfchris84: regarding d:Q88549749: is your script also able to do all or some of the other steps, like creating the entries (currently done with petscan, might take several days or weeks) or adding the cross-references ? --M2k~dewiki (talk) 22:57, 28 March 2020 (UTC)
- @M2k~dewiki: of course my python script is also able to do cross-referencing based on the given wikipedia-links or gnd-ids in the infoboxes on wikisource page. e.g that's also the way how i add main subject statement for Gartenlaube articles, because Wikisource-editors add main subjects on WikiSource-Pages with Wikipedia-Links. Unfortunately i need some time to point my script against BLKÖ. But then i can do both, create new items and update all the existing items. --Mfchris84 (talk) 23:02, 28 March 2020 (UTC)
- Personally, I tend to agree with Mfchris84, especially as I had to clean up several tens of thousands of DNB entries that kept getting mixed up. (i.e. I prefer the earlier version of ADB [7].) --- Jura 23:01, 28 March 2020 (UTC)
- I would also prefer this way. With a suffix in parentheses it also doesn't look as bad as with the namespace-like prefix used in Wikisource. --Mfchris84 (talk) 23:03, 28 March 2020 (UTC)
- What shall we do with items like Biographisches Lexikon des Kaiserthums Oesterreich volume 26 (Q66809342) (for a specific volume)? I suppose we could link them later with a qualifier from volume (P478). --- Jura 23:16, 28 March 2020 (UTC)
@M2k~dewiki, Jura1: - OK, after four hours of coding, I think I have a first approach for creating some good items for BLKÖ. My script gitlab.com/blkoe is now able to create items like Majláth von Székhély, Joseph (II.) Graf (BLKÖ) (Q88911969) via QuickStatements. Improvement is still needed for follows/followed by, but that would work even better in a second round, because in the first round these links cannot be set for articles whose neighbours do not have a Wikidata item yet. My script also added described by source (P1343) on the biographical item, Q1163609#P1343, with qualifiers and references. What do you think about this item? I think we could go forward this way; in a "second round" all items could get improvements like
- missing volume/page statements
- missing main subject (and its cross-reference on the biographical item - but maybe M2k~dewiki will do this with his/her script?)
- adding follows/followed by (which needs that all items are created)
--Mfchris84 (talk) 01:52, 29 March 2020 (UTC)
@Mfchris84: looks good to me, please go ahead. Thanks a lot! --M2k~dewiki (talk) 02:05, 29 March 2020 (UTC)
- @M2k~dewiki, Jura1: Thanks, I will go ahead with my approach. One point we have to discuss is how we deal with the huge number of "cross-references" within the encyclopedia, like BLKÖ:Lubomirski, Georg Fürst (Verweis) Lubomirski, Georg Fürst (Verweis) (BLKÖ) (Q88898002), which only states a different name for the encyclopedic article BLKÖ:Lubomirski, Georg Fürst Lubomirski, Georg Fürst (BLKÖ) (Q88893826). We should avoid creating items like Lubomirski, Georg Fürst (Verweis) (BLKÖ) (Q88898002); I know this happened in the PetScan import, which can't deal with such constraints. What would be a good model for such a "Verweis"?
- I modified Lubomirski, Georg Fürst (Verweis) (BLKÖ) (Q88898002) in the following points:
- * instance of (P31) cross-reference (Q1302249), with said to be the same as (P460) Lubomirski, Georg Fürst (BLKÖ) (Q88893826) as the link to the biographical article.
- * Adding volume and page information, and also follows/followed by links as given in the encyclopedia.
- * Description changed to "cross-reference to an entry in the [BLKÖ]"
- --06:27, 29 March 2020 (UTC)
- For cross-references, main subject could be used to point to the other article. P460 isn't generally used as a qualifier. HarvestTemplates could be used to set follows/followed by, if there is interest. The main advantage of cross-reference items is that we won't miss pages. It's up to you how much detail you want to add to them, if any. --- Jura 06:54, 29 March 2020 (UTC)
- Good point.
During my first QS batch job, a problem occurred when adding described by source (P1343) with the "LAST" term, so there are self-referencing statements now. This query finds them:
SELECT * WHERE {
?item p:P1343 ?descStmt.
?descStmt ps:P1343 wd:Q665807;
pq:P805 ?item.
?blkoArticle wdt:P1433 wd:Q665807;
wdt:P921 ?item.
}
I have cleaned up those self-references, and I will move described by source (P1343) into a second round as well, once I have the Q-ID for the article. --Mfchris84 (talk) 06:57, 29 March 2020 (UTC)
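The two-round split described here can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual gitlab.com/blkoe script: the function names `round_one` and `round_two` are hypothetical, the properties and the BLKÖ item Q665807 are taken from the query above, and the commands use the tab-separated QuickStatements V1 syntax. The example IDs assume that Q1163609 is the biographical item belonging to article Q88911969, as the earlier comment suggests.

```python
BLKOE = "Q665807"  # BLKÖ item, as used in the query above


def round_one(label_de, subject_qid=None):
    """Round 1: create the article item.

    'LAST' is used only for statements about the newly created item itself,
    so it can never end up as a qualifier value on another item.
    """
    cmds = [
        "CREATE",
        f'LAST\tLde\t"{label_de}"',
        f"LAST\tP1433\t{BLKOE}",  # published in: BLKÖ
    ]
    if subject_qid:
        cmds.append(f"LAST\tP921\t{subject_qid}")  # main subject
    return cmds


def round_two(subject_qid, article_qid):
    """Round 2: once the article's Q-ID is known, add described by source
    (P1343) to the biographical item, with the article as P805 qualifier."""
    if subject_qid == article_qid:
        # Guard against the self-referencing statements found by the query.
        raise ValueError(f"self-reference on {subject_qid}")
    return f"{subject_qid}\tP1343\t{BLKOE}\tP805\t{article_qid}"


if __name__ == "__main__":
    print("\n".join(round_one("Majláth von Székhély, Joseph (II.) Graf (BLKÖ)", "Q1163609")))
    print(round_two("Q1163609", "Q88911969"))
```

Because round two only runs with an explicit article Q-ID, the statement on the biographical item never depends on what "LAST" happens to refer to in the batch.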
- Looks good. I went through a few entries. --- Jura 08:09, 29 March 2020 (UTC)
- I did a few stats at Help:Import BLKÖ from wikisource. --- Jura 09:03, 29 March 2020 (UTC)
- Just a minor thing: if you want to include vol/page in the English description, please use, e.g. "(vol. 16, p. 371)" instead of "(Bd. 16, S. 371)". --- Jura 14:45, 29 March 2020 (UTC)
@Mfchris84: it seems that the English and German descriptions are mixed up ("(vol. 16, p. 357)" vs. " (Bd. 16, S. 357)"). Examples: d:Q88936832, d:Q67389376. --M2k~dewiki (talk) 00:51, 30 March 2020 (UTC)
- @M2k~dewiki: Correct. I have already changed my script and will start an update process as soon as possible. --Mfchris84 (talk) 05:19, 30 March 2020 (UTC)
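One way to keep the volume/page suffix from being mixed up between languages is to select the pattern by language code. The sketch below is a hypothetical helper (`vol_page_suffix` and `VOL_PAGE` are illustrative names, not part of the actual script); the two patterns are the ones quoted in the comments above.

```python
# Volume/page patterns per description language, as quoted in the discussion.
VOL_PAGE = {
    "en": "(vol. {vol}, p. {page})",
    "de": "(Bd. {vol}, S. {page})",
}


def vol_page_suffix(lang, vol, page):
    """Return the volume/page suffix for a description in the given language.

    Raises KeyError for languages without a pattern, so a missing
    translation is noticed instead of silently falling back to German.
    """
    return VOL_PAGE[lang].format(vol=vol, page=page)


if __name__ == "__main__":
    print(vol_page_suffix("en", 16, 371))
    print(vol_page_suffix("de", 16, 371))
```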
@Zabia: this project might also be of interest to you, for example de:Benutzer:M2k~dewiki/Test, e.g. to check the completeness (do all Wikisource articles link to an existing article in the German-language Wikipedia?) and correctness (does the article really exist, or has it been moved to another lemma?) of the links from the Wikisource articles to the German-language articles. (Also see Help:Import BLKÖ from wikisource.) --M2k~dewiki (talk) 11:52, 30 March 2020 (UTC)
- For example, in this Wikisource article the lemma in the German Wikipedia has been changed, but the change has not been made in the Wikisource article. --M2k~dewiki (talk) 12:00, 30 March 2020 (UTC)
- I added some of the mappings to Help:Import BLKÖ from wikisource. BTW, I'm trying to do word counts and character counts for NBD. Depending on how it goes, I might propose to add this here too. For the latter count, this property proposal could help me, but it's somewhat stalled and I don't see the suggested alternative working. --- Jura 12:26, 30 March 2020 (UTC)
- For main subject (P921), Help:Add main subject with Mix-n-Match might help. If we want to use it for this, it would be good if Wikidata:Property_proposal/MxM_xref was ready. --- Jura 12:46, 2 April 2020 (UTC)
- BTW, if you want to speed things up, maybe you want to skip imported from Wikimedia project (P143) and Wikimedia import URL (P4656) on anything but main subject (P921). Even for P921, P143 could be sufficient. --- Jura 13:10, 2 April 2020 (UTC)
- Sorry @M2k~dewiki, my English and my technical knowledge are not good enough to understand it all. And at the moment I have enough to do with transcribing the Codex Manesse. Zabia (talk) 05:56, 31 March 2020 (UTC)
- Seems to be advancing fine, despite some hiccups with QuickStatements yesterday. I expanded the help page a bit. Some random points:
- Completed volume/page for all existing items. (9000)
- Added/fixed a few P31 for cross-reference (Q1302249)
- For vol. 1 & 2, I added the scan available at Commons with the relevant file page (Hopefully the thumbnails work one day).
- Wikisource has an external link to scans. If there is interest, these could be added with full work available at URL (P953).
- Should a "proofread"-badge be set on the sitelink?
- --- Jura 21:44, 4 April 2020 (UTC)
- This section was archived on a request by: continued at Help:Import BLKÖ from wikisource --- Jura 06:44, 23 April 2020 (UTC)