Wikidata talk:WikiProject 20th Century Press Archives/Tools & tasks

Best strategy to add new items derived from PM20?

  1. Producing all available properties in one script
    (current implementation for companies, with type/official-name/incepted/abandoned/gnd-id: query companies_missing_in_wikidata.rq, which also restricts the result to entries checked with Mix'n'match, and script create_missing_wikidata.pl),
  2. or alternatively, adding items with a
    1. very basic query/script covering only label(s), perhaps description(s), type and PM20 ID,
    2. complemented by many queries/scripts, each of which adds a single property to all linked items (already existing or created from PM20) lacking that property
    (example implementation for organization type (P31) in missing_class_via_pm20.rq).

The second strategy requires more queries/scripts, more invocations and better documentation. However, it allows for progressive improvement (e.g., incrementally mapping more and more professions). It can also be combined with manual addition of items (e.g., via Mix'n'match), where likewise only very few properties are populated automatically.

-- Jneubert (talk) 11:42, 28 April 2019 (UTC)
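
The single-property step of the second strategy could be sketched roughly as below. This is only an illustration in Python, not the project's Perl code: the function name `missing_statements`, the record layout and all sample identifiers are made up for the example. It shows why repeated runs allow progressive improvement: an item is only touched if it still lacks the property and the source has a value for it.

```python
def missing_statements(linked_items, prop):
    """Yield (item_id, value) pairs for linked items that lack `prop`
    in Wikidata but have a value for it in the source data.

    The record layout used here is a hypothetical one chosen for the sketch.
    """
    for item in linked_items:
        if prop in item["wikidata"]:
            continue  # property already present, nothing to add
        value = item["source"].get(prop)
        if value is not None:
            yield item["id"], value


# Made-up sample records, not real PM20 or Wikidata data.
linked_items = [
    {"id": "item-a", "wikidata": {"P31"}, "source": {"P31": "class-x", "P227": "id-1"}},
    {"id": "item-b", "wikidata": {"P31", "P227"}, "source": {"P227": "id-2"}},
    {"id": "item-c", "wikidata": set(), "source": {"P31": "class-y"}},
]

gnd_additions = list(missing_statements(linked_items, "P227"))
type_additions = list(missing_statements(linked_items, "P31"))
print(gnd_additions)
print(type_additions)
```

Each property gets its own pass, so a new mapping (for example, one more profession) only requires rerunning the pass for that property.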

Implementation: One script, multiple queries, multiple invocations for enhancement
One script (add_missing_wikidata.pl), which can be called in 'create' or 'enhance' mode. For item creation, it uses a query such as persons_missing_in_wikidata.rq. For enhancing Wikidata one property at a time, it uses either a generic query (property_missing_in_wikidata.rq, which works if the source property can be addressed directly or via a property path) or a configurable property-specific query. Jneubert (talk) 10:04, 6 May 2019 (UTC)
Examples
Invoking the script as add_missing_wikidata.pl pm20_pe create and running the resulting lines below in QuickStatements (QS) created Hagenbeck (Q63527827):
 CREATE
 LAST|Lde|"Hagenbeck"
 LAST|Len|"Hagenbeck"
 LAST|Dde|"Hamburger Zoodirektoren-Familie"
 LAST|Den|"family"
 LAST|P4293|"pe/006937"
 LAST|P31|Q8436|S248|Q36948990|S4293|"pe/006937"|S1810|"Hagenbeck <Familie>"|S813|+2019-05-06T00:00:00Z/11
 LAST|P227|"118700537"|S248|Q36948990|S4293|"pe/006937"|S1810|"Hagenbeck <Familie>"|S813|+2019-05-06T00:00:00Z/11
Invoking the script as add_missing_wikidata.pl pm20_pe enhance P227 and running the resulting line below in QS added the GND ID to Georges Ibrahim Abdallah (Q3102918):
 Q3102918|P227|"118848658"|S248|Q36948990|S4293|"pe/000021"|S1810|"Abdallah, Georges Ibrahim"|S813|+2019-05-06T00:00:00Z/11
Jneubert (talk) 10:04, 6 May 2019 (UTC)
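
For readers who want to see how such a line is assembled, here is a minimal Python sketch that rebuilds the 'enhance' line from the example above. It is not the code of add_missing_wikidata.pl; the helper names `reference` and `enhance_line` are hypothetical, and the sketch only handles string-valued properties such as P227.

```python
from datetime import date

STATED_IN = "Q36948990"  # item used with S248 in the examples above


def reference(pm20_id, name, retrieved):
    """Return the reference fields shared by the example statements."""
    return [
        "S248", STATED_IN,
        "S4293", f'"{pm20_id}"',
        "S1810", f'"{name}"',
        # Day precision (/11), as in the example lines.
        "S813", f"+{retrieved.isoformat()}T00:00:00Z/11",
    ]


def enhance_line(qid, prop, value, pm20_id, name, retrieved):
    """Build one pipe-separated QS line adding a string-valued property."""
    fields = [qid, prop, f'"{value}"'] + reference(pm20_id, name, retrieved)
    return "|".join(fields)


line = enhance_line(
    "Q3102918", "P227", "118848658",
    "pe/000021", "Abdallah, Georges Ibrahim", date(2019, 5, 6),
)
print(line)
```

The printed line is identical to the one shown for Q3102918. For item creation, the same reference fields would be appended to statements that start with LAST instead of a QID.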

Data corrections


Errors which should be fixed in the upstream IFIS database.

PM20 companies en


PM20 companies de


PM20 companies fr


Wrong GND in IFIS


Wrong predecessor / successor relation


Multiple entries per item
