Wikidata:Bot requests/Italian Wikipedia person data

Italian Wikipedia person data

General discussion: Wikidata:Project chat#Italian person data (now archived -> consensus found); original location User_talk:Legobot/properties.js#Italian_person_data
  • Category: w:it:Categoria:BioBot (over 200,000 people, cf. [1])
  • Properties: w:it:Template:Bio#Tabella_completa, to be fetched from template usage (not everything translated to categories)
    • Examples: name and gender (mandatory), surname, place/date/year of birth/death, one out of 552 defined jobs
    • More details will follow. Are you also interested in a mapping for the jobs? There are a lot of them and I doubt any other wiki has them in a structured format. --Nemo 19:27, 17 February 2013 (UTC)
    • I will work on implementing template parsing hopefully by this weekend.
    • Sesso: P21 -> Q6581097 if M, Q6581072 if F (note that this field is used only for grammatical purposes, so "intersex" is not used; in non-trivial cases it may or may not reflect the policy here on Wikidata)
        Done
    • LuogoNascita (but LuogoNascitaLink should prevail if available): P19 -> the entry for the item corresponding to the page with that title
        Done
    • LuogoNascitaAlt: same as above, for complex cases with alternatives; maybe a secondary statement for P19? No other property is available.
    • NoteNascita: pull sources for the Nascita statements from the ref tags in here.
    • LuogoMorte, LuogoMorteLink and LuogoMorteAlt, NoteMorte: same as above but for P20
        Done
    • Nazionalità: P27 -> linked country
    • NazionalitàNaturalizzato: additional statement to P27
        Done for countries that are instances of a subclass of state (Q7275), with a few exceptions; see the list of articles not imported yet and the breakdown by their value.
        Info: See the map from adjectives to countries. The local information is based on current sources. Except for 4 entities still to sync, all the values used are compatible with this property. See further discussion.
    • PostNazionalità: this field may contain sources for any of the previous statements (more general ones could also be right after the end of the template or in FineIncipit).
    • FineIncipit: replaces standard occupation etc., maybe add to item description?
    • Immagine: P18 -> image with this name (check if it's on Commons; over 35k usages)
    • For each statement: add as reference the Property:P143 with value Q11920, example cat (update: as discussed at project chat).
    • First name (Nome): P735
        Done where it equals an it.wiki article and hence entity.
    • Last name (Cognome): P734
        Done (same). Info: see below on disambiguation pages and transliteration.
    • Day and month of birth (GiornoMeseNascita) + Year of birth (AnnoNascita): P569
    • Day and month of death (GiornoMeseMorte) + Year of death (AnnoMorte): P570
      Do not add a date that conflicts with an Integrated Authority File (Q36578) statement, if one is available.
        Done in part by Dexbot, for dates after 1920. Info: ViscoBot had started but stopped long ago.
        Question: I have also written the code to import dates of birth and death, but I'm not running it yet because of one important question: which calendar model do you use for dates of birth and death? In some places the Gregorian calendar wasn't common until 1912, so I can't add dates before 1912 because the bot can't be sure about their calendar model.  – The preceding unsigned comment was added by Ladsgroup (talk • contribs).
      We're verifying; I'll let you know the final outcome. Past discussions seem to have all agreed on forcing the Gregorian calendar in the template, with the option to indicate the Julian calendar next to it with a warning. --Nemo 13:27, 27 April 2014 (UTC)
    • Title to be used before name, or after it in some languages other than Italian (Titolo): P511 (about 3k usages)
    • Missing properties:
      • Unrecognized citizenship (stateless peoples), e.g. Kurds (Cittadinanza)
      • Free-text notes on dates of birth/death (NoteNascita, NoteMorte): some sources could be extracted from here. The content is very varied, but in 55% of cases it contains a URL, which could be imported as a source.
    This should be it. --Nemo 08:49, 22 February 2013 (UTC)
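For illustration, part of the field list above can be sketched in Python. This is only a rough sketch under stated assumptions, not the bot's actual code: the function names, the dictionary of already-parsed template parameters, and the tuple layout of the result are all hypothetical. It covers Sesso, the place fields (with the ...Link variant prevailing) and the date fields. It deliberately sets no calendar model, since that question is still open above, and it leaves out resolving page titles to items and attaching the P143 reference.

```python
# Sketch only: all names here are hypothetical, not taken from the bot's code.
GENDER_ITEMS = {"M": "Q6581097", "F": "Q6581072"}  # P21 values listed above
IMPORTED_FROM = ("P143", "Q11920")  # reference requested for every statement

ITALIAN_MONTHS = {
    "gennaio": 1, "febbraio": 2, "marzo": 3, "aprile": 4,
    "maggio": 5, "giugno": 6, "luglio": 7, "agosto": 8,
    "settembre": 9, "ottobre": 10, "novembre": 11, "dicembre": 12,
}


def place_title(params, base):
    """Return the page title for a place; the ...Link field prevails if set."""
    link = params.get(base + "Link", "").strip()
    plain = params.get(base, "").strip()
    return link or plain or None


def parse_date(day_month, year):
    """Combine e.g. '17 febbraio' and '1913' into (1913, 2, 17), else None."""
    parts = day_month.strip().split()
    year = year.strip()
    if len(parts) != 2 or not year.isdecimal():
        return None
    day = parts[0].rstrip("º°")  # assumes the first of the month may be written "1º"
    month = ITALIAN_MONTHS.get(parts[1].lower())
    if not day.isdecimal() or month is None:
        return None
    return (int(year), month, int(day))


def build_claims(params):
    """Map parsed Bio template parameters to (property, value) pairs."""
    claims = []
    sex = params.get("Sesso", "").strip().upper()
    if sex in GENDER_ITEMS:
        claims.append(("P21", GENDER_ITEMS[sex]))
    for pid, base in (("P19", "LuogoNascita"), ("P20", "LuogoMorte")):
        title = place_title(params, base)
        if title:
            claims.append((pid, title))  # a page title, still to be resolved to an item
    for pid, dm, yr in (
        ("P569", "GiornoMeseNascita", "AnnoNascita"),
        ("P570", "GiornoMeseMorte", "AnnoMorte"),
    ):
        date = parse_date(params.get(dm, ""), params.get(yr, ""))
        if date:
            claims.append((pid, date))  # no calendar model attached here
    return claims
```

Returning plain tuples keeps the mapping separate from any write to Wikidata, so the output can be reviewed, or filtered by a year cutoff, before anything is saved.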
A proposal on sourcing for Wikidata was moved to Wikidata:Project chat#Proposal: preventive control of imported data correctness
As far as edit summaries go, the bot actually does send proper edit summaries, in the format of Bot: Setting [[Property:{pid}|{pid}]] to [[{target_qid}]]; using [[:{lang}:{source}]]; requested by [[User talk:{user}|{user}]], it's just that the software doesn't support them yet. It may be worth putting this run on hold until the software does support custom summaries.
I do believe that at this point, we may need to look at how to properly source these claims, since they are no longer "obvious". Maybe that should be a discussion on Project chat? I believe there are legitimate concerns to address before this request can go forward, as well as code that I need to work on. Legoktm (talk) 01:23, 23 February 2013 (UTC)
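As a small illustration of the summary format quoted above, here is a Python sketch that fills in the placeholders. The template string is copied from the comment; the function name and the example argument values are hypothetical and not taken from the bot's code.

```python
# Template string as quoted in the comment above; placeholders are filled by name.
SUMMARY_TEMPLATE = (
    "Bot: Setting [[Property:{pid}|{pid}]] to [[{target_qid}]]; "
    "using [[:{lang}:{source}]]; "
    "requested by [[User talk:{user}|{user}]]"
)


def edit_summary(pid, target_qid, lang, source, user):
    """Build the edit summary for one claim (hypothetical helper)."""
    return SUMMARY_TEMPLATE.format(
        pid=pid, target_qid=target_qid, lang=lang, source=source, user=user
    )


# Example values are made up for illustration.
print(edit_summary("P21", "Q6581097", "it", "Categoria:BioBot", "Example"))
```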
If edit summaries are a problem, we could just use a different username for the bot, like "Italian Wikipedia person data import bot".
What fields are no longer obvious, specifically? Surely place of birth is more "obvious" and less controversial than gender, for instance. I think it makes sense to start only with the "obvious" ones: it seems to me that most worries are about nationalist controversies, so probably those are the only fields to exclude in the first run? Otherwise, sources exist of course, you could pull them at the same time if people feel it can't be done later. --Nemo 08:50, 23 February 2013 (UTC)
Ping. I have updated the data above, it seems to me that we no longer have anything to wait for? Were the easy parts like gender done already? --Nemo 08:58, 23 August 2013 (UTC)