I generally work in Jupyter notebooks and keep my intermediate work. If any of these projects sound interesting, I can likely send relevant Python code.
- Adding interwiki links to Commons categories.
- Linking Wikidata and OpenStreetMap (Q936), especially when it comes to schools in the US. This is mostly happening on the OSM side (I'm not tagging things with OpenStreetMap relation ID (P402)); see the Overpass sketch after this list.
- Moving authority control data that is local to English Wikipedia or Wikimedia Commons into Wikidata, reporting errors where necessary.
- Converting manual redlink lists for English Wikipedia's Women in Red project to use Wikidata, by creating items for the redlinked women and adding appropriate described by source (P1343) entries (see the QuickStatements sketch after this list).
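For the OSM linking above, the Overpass side of the matching looks roughly like the following. This is a minimal sketch, assuming the public Overpass and Wikidata Query Service endpoints; Virginia and the exact-name comparison are stand-ins for whatever state and fuzzier matching a real pass would use.

```python
"""Sketch: find OSM schools with no wikidata=* tag and pair them with
Wikidata school items by name. Endpoints are public; the state and the
exact-name match are illustrative assumptions."""
import requests

OVERPASS = "https://overpass-api.de/api/interpreter"
WDQS = "https://query.wikidata.org/sparql"

# Schools in Virginia (illustrative) that don't yet carry a wikidata tag.
overpass_query = """
[out:json][timeout:120];
area["name"="Virginia"]["admin_level"="4"]->.va;
nwr["amenity"="school"][!"wikidata"](area.va);
out tags center;
"""
osm_schools = requests.post(OVERPASS, data={"data": overpass_query}).json()["elements"]

# Wikidata items for schools (Q3914) located in Virginia (Q1370).
sparql = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31/wdt:P279* wd:Q3914 ;
        wdt:P131* wd:Q1370 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""
r = requests.get(WDQS, params={"query": sparql, "format": "json"},
                 headers={"User-Agent": "osm-school-matching-sketch/0.1"})
by_label = {b["itemLabel"]["value"].lower(): b["item"]["value"]
            for b in r.json()["results"]["bindings"]}

# Naive exact-name matching; a real pass needs fuzzier logic and review.
for school in osm_schools:
    name = school.get("tags", {}).get("name", "").lower()
    if name in by_label:
        print(school["type"], school["id"], "->", by_label[name])
```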
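The Women in Red conversion bottoms out in QuickStatements commands. A minimal sketch, assuming the names and a source QID have already been pulled out of the redlink list; the sample row and QID below are placeholders.

```python
"""Sketch: emit QuickStatements v1 commands that create an item per
redlinked woman. The input data here is made up."""

# (name, QID of the work that describes her) -- placeholder data
redlinks = [
    ("Jane Example", "Q12345"),  # Q12345 stands in for the source's QID
]

lines = []
for name, source_qid in redlinks:
    lines.append("CREATE")
    lines.append(f'LAST\tLen\t"{name}"')        # English label
    lines.append("LAST\tP31\tQ5")               # instance of: human
    lines.append("LAST\tP21\tQ6581072")         # sex or gender: female
    lines.append(f"LAST\tP1343\t{source_qid}")  # described by source
print("\n".join(lines))  # paste into https://quickstatements.toolforge.org/
```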
Most of these are activities that were at least partially automated, though a few were done thoroughly by hand.
- Updating the entries for agencies of US state governments, adding appropriate instance of (P31), applies to jurisdiction (P1001), official website (P856), and Twitter username (P2002) values
- Cleaning up the publication history of several academics who had particularly messed-up profiles
- Importing arXiv author ID (P4594) by matching with ORCID iD (P496) (see the ORCID sketch after this list)
- Importing identifiers for writers at particular publications, such as The Atlantic author ID (P6791), Harper's author ID (P6784), and LARB contributor ID (P5639), by searching English Wikipedia for appropriately formatted links (see the exturlusage sketch after this list)
- Extracting climate data for global warming potential (P2565) on behalf of User:Leyo
- Cleaning hundreds of invalid Google Scholar author ID (P1960) values from the (at the time) very screwed-up entry for ESSEC Business School (Q273642) and attempting to match them to actual humans
- Adding Academic Tree ID (P2381) for people whose biography on that site included a Wikipedia link
- Adding some basic district information to everyone in the Virginia House of Delegates (Q320275) at the time
- Importing Invaluable.com person ID (P4927) based on links to Invaluable in Commons file descriptions, then winding through creator pages and categories to get to artists.
- Linking Commons categories about artists that matched on English label, year of birth, and year of death (see the label-and-dates sketch after this list)
- Import Photographers' Identities Catalog ID (P2750) by matching on VIAF ID (P214) (the database is on GitHub)
- Adding appropriate category contains (P4224) statements to the subcategories of Category:Musicians by band (Q7411243) on English Wikipedia (Q328), then adding member of (P463) claims on musicians based on their category membership (see the category-members sketch after this list).
- Merge duplicates in The Peerage (Q21401824) based on in-article links in English Wikipedia (Q328)
- Do something similar with Find a Grave (Q63056), though adding IDs instead of merging duplicates.
- Do something similar on Wikimedia Commons (Q565), though with some additional logic to wind through categories
- Adding Academic Tree ID (P2381) based on matching with various identifiers linked in the description field (exposed through the Mix'n'Match catalog)
- Add Mathematics Genealogy Project ID (P549) based on external links in English Wikipedia (Q328) and German Wikipedia (Q48183)
- Importing the researchers and the citation graph for Janelia Research Campus (Q1319362)
- Importing official colors from a bunch of Virginia high schools listed in an official Virginia High School League (Q7934309) directory
- Import all episodes of Radiolab (Q2856080), and link in guests where possible/practical
- Parse The Political Graveyard (Q7757656) (or else get the author to send me a database) so that I can get Political Graveyard politician ID (P8462) into Mix'n'Match
- Adding point in time (P585) and award rationale (P6208) qualifiers for ACM Fellow (Q18748039)
- Inventories of American Painting and Sculpture control number (P4814) is a very underused property, and given all the links on Wikipedia it could be used far more. However, ensuring precise matches is trickier here than usual.
- Importing data on nuclear tests from official records, and matching to the Wolfram Language codes in Mix'n'Match
- Giving every public school district in the US an official website, through a mix of Common Core of Data (Q25834722) and manual internet search
- Scraping the alumni sections of English Wikipedia pages of high schools in Virginia (and possibly beyond, if I can get this to work)
- Finding a way to extract book titles from Wikipedia articles to make it possible to auto-match with VIAF's "Works" section (see the italics sketch after this list)
- Scraping journal article citations either from the Wikipedia pages of academics or from publication lists on their academic webpages, and matching things that way
- Somehow gathering a database of official email addresses of academics (probably off-wiki to prevent spam) and linking items for corresponding authors.
- Get Our Campaigns (Q97064980) data into Mix'n'Match
- Look for cases where a Library of Congress authority record links to Wikipedia, and check if the Wikidata item has that LCCN on it (see the id.loc.gov sketch after this list).
- Propose a new property for the Royal Society Collections, and then figure out how to mass-import them into articles.
- US state legislatures pretty much all have giant tables with politician links and their districts. So does Vote Smart (Q7249368). It should be possible to turn those wikitext tables into Wikidata statements, and then use this to mass-link Vote Smart ID (P3344), which currently has poor coverage. Also, when a state has items for individual districts, those can be added as qualifiers (see the read_html sketch after this list).
- This form for quickly converting a FamilySearch record into
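For the arXiv/ORCID matching above, the Wikidata side is a single query; the lookup against arXiv itself is the fiddly part and is only stubbed out here. A sketch, assuming the public WDQS endpoint:

```python
"""Sketch: pull items with an ORCID iD (P496) but no arXiv author ID
(P4594). resolve_arxiv_author() is hypothetical and not implemented."""
import requests

WDQS = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?item ?orcid WHERE {
  ?item wdt:P496 ?orcid .                   # has an ORCID iD
  FILTER NOT EXISTS { ?item wdt:P4594 [] }  # but no arXiv author ID yet
}
LIMIT 1000
"""
r = requests.get(WDQS, params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "arxiv-orcid-matching-sketch/0.1"})
for b in r.json()["results"]["bindings"]:
    qid = b["item"]["value"].rsplit("/", 1)[-1]
    orcid = b["orcid"]["value"]
    # resolve_arxiv_author(orcid) would go here: checking whether arXiv
    # exposes an author identifier tied to that ORCID is the real work.
    print(qid, orcid)
```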
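The search for appropriately formatted links can go through the MediaWiki exturlusage API, roughly as below, here for The Atlantic author ID (P6791). The URL pattern for its author pages is an assumption; the regex and the property are the parts that change per publication.

```python
"""Sketch: find enwiki articles linking to theatlantic.com/author/...
and extract the author slug as a candidate identifier."""
import re
import requests

API = "https://en.wikipedia.org/w/api.php"
# Assumed URL shape for The Atlantic's author pages.
AUTHOR_RE = re.compile(r"theatlantic\.com/author/([a-z0-9-]+)", re.I)

params = {
    "action": "query", "list": "exturlusage", "format": "json",
    "euquery": "theatlantic.com/author/", "eunamespace": 0, "eulimit": 500,
}
found = {}
while True:
    data = requests.get(API, params=params).json()
    for hit in data["query"]["exturlusage"]:
        m = AUTHOR_RE.search(hit["url"])
        if m:
            # article title -> candidate The Atlantic author ID (P6791)
            found.setdefault(hit["title"], set()).add(m.group(1).lower())
    if "continue" not in data:
        break
    params.update(data["continue"])

for title, ids in sorted(found.items()):
    print(title, sorted(ids))
```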
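For the label-plus-dates matching of Commons artist categories, one workable shape is to push the candidate search into WDQS's mwapi service and filter on birth and death years there. A sketch, assuming the (name, birth year, death year) triples have already been scraped from the category pages:

```python
"""Sketch: label search via WDQS's EntitySearch service, narrowed by
birth (P569) and death (P570) years."""
import requests

WDQS = "https://query.wikidata.org/sparql"

def candidates(name: str, born: int, died: int) -> list[str]:
    query = """
    SELECT ?item WHERE {
      SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                        wikibase:api "EntitySearch" ;
                        mwapi:search %r ;
                        mwapi:language "en" .
        ?item wikibase:apiOutputItem mwapi:item .
      }
      ?item wdt:P569 ?b ; wdt:P570 ?d .
      FILTER(YEAR(?b) = %d && YEAR(?d) = %d)
    }
    """ % (name, born, died)
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    return [b["item"]["value"] for b in r.json()["results"]["bindings"]]

print(candidates("Vincent van Gogh", 1853, 1890))
```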
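The musicians-by-band pass is mostly category plumbing. A sketch of the enwiki side using the categorymembers API; the Beatles category and QID are just an illustration, and the real run walked the whole subcategory tree.

```python
"""Sketch: list the members of one band-membership category and emit
member of (P463) candidates."""
import requests

API = "https://en.wikipedia.org/w/api.php"

def category_members(cat: str):
    """Yield main-namespace members of an enwiki category."""
    params = {"action": "query", "list": "categorymembers", "format": "json",
              "cmtitle": cat, "cmnamespace": 0, "cmlimit": 500}
    while True:
        data = requests.get(API, params=params).json()
        yield from data["query"]["categorymembers"]
        if "continue" not in data:
            return
        params.update(data["continue"])

for member in category_members("Category:The Beatles members"):
    print(member["title"], "-> member of (P463): Q1299")
```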
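For extracting book titles, one crude but cheap starting point is that wikitext marks most work titles with ''italics''. A sketch, assuming titles containing apostrophes can be ignored on a first pass; the page title is illustrative.

```python
"""Sketch: pull italicized spans out of an article's wikitext as
candidate work titles to compare against VIAF's Works list."""
import re
import requests

API = "https://en.wikipedia.org/w/api.php"
params = {"action": "parse", "page": "Ursula K. Le Guin",
          "prop": "wikitext", "format": "json", "formatversion": 2}
wikitext = requests.get(API, params=params).json()["parse"]["wikitext"]

# ''italics'' (but not '''bold''') usually marks a title in wikitext.
titles = set(re.findall(r"(?<!')''([^'\n]{2,80}?)''(?!')", wikitext))
print(sorted(titles)[:20])
```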
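For the LCCN check, both halves are single requests per record. A sketch; the record number is illustrative, and the "links to Wikipedia" test here is just a string search over the id.loc.gov JSON rather than proper MADS parsing.

```python
"""Sketch: does an LC name authority record mention Wikipedia, and does
any Wikidata item already carry that LCCN (P244)?"""
import requests

WDQS = "https://query.wikidata.org/sparql"

def wikidata_has_lccn(lccn: str) -> bool:
    query = 'ASK { ?item wdt:P244 "%s" }' % lccn
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    return r.json()["boolean"]

def lc_record_mentions_wikipedia(lccn: str) -> bool:
    # id.loc.gov serves each name authority record as JSON.
    r = requests.get(f"https://id.loc.gov/authorities/names/{lccn}.json")
    return "wikipedia" in r.text.lower()

lccn = "n79021164"  # illustrative record
if lc_record_mentions_wikipedia(lccn) and not wikidata_has_lccn(lccn):
    print(lccn, "links to Wikipedia but no item carries that LCCN")
```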
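For the legislature tables, pandas can usually lift a rendered roster directly rather than parsing wikitext. A sketch, assuming the page renders its roster as an HTML table with a district column; every real table needs its own column mapping.

```python
"""Sketch: pull the roster table off a rendered legislature page."""
import pandas as pd

# Illustrative page; read_html pulls every <table> on the rendered page.
URL = "https://en.wikipedia.org/wiki/Virginia_House_of_Delegates"
tables = pd.read_html(URL)

# Pick the first table that looks like a roster (has a District column).
roster = next(t for t in tables
              if any("district" in str(c).lower() for c in t.columns))
for _, row in roster.iterrows():
    # name + district is enough to line a row up against Vote Smart's
    # directory and against per-district Wikidata items.
    print(row.to_dict())
```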
Should be created
- The tool described in phab:T249687
- (If this is possible): A small SPARQL federated service that would bridge SPARQL queries and PagePile lists (a workaround sketch follows)
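Until something like that bridge exists, the workaround is to inline a PagePile into a VALUES clause by hand. A sketch; the pile ID is a placeholder, and my memory of the PagePile API's exact shape should be double-checked before relying on it.

```python
"""Sketch: fetch an enwiki PagePile, resolve titles to QIDs, and splice
them into a SPARQL VALUES clause."""
import requests

PILE_ID = 12345  # placeholder pile ID
pile = requests.get("https://pagepile.toolforge.org/api.php",
                    params={"id": PILE_ID, "action": "get_data",
                            "format": "json"}).json()
titles = pile["pages"]

# Resolve titles to QIDs through the enwiki API (50 titles per request).
API = "https://en.wikipedia.org/w/api.php"
qids = []
for i in range(0, len(titles), 50):
    r = requests.get(API, params={
        "action": "query", "prop": "pageprops", "ppprop": "wikibase_item",
        "titles": "|".join(titles[i:i + 50]), "format": "json",
        "formatversion": 2}).json()
    qids += [p["pageprops"]["wikibase_item"]
             for p in r["query"]["pages"] if "pageprops" in p]

values = " ".join(f"wd:{q}" for q in qids)
sparql = "SELECT ?item ?dob WHERE { VALUES ?item { %s } ?item wdt:P569 ?dob . }" % values
print(sparql)  # run against https://query.wikidata.org/
```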