I generally work in Jupyter notebooks and keep my intermediate work. If any of these projects sound interesting, I can likely send relevant Python code.
- Adding interwiki links to Commons categories.
- Linking Wikidata and OpenStreetMap (Q936), especially when it comes to schools in the US. This is mostly happening on the OSM side (I'm not tagging things with OpenStreetMap relation ID (P402)); see the Overpass sketch after this list.
- Moving authority control data that is local to English Wikipedia or Wikimedia Commons into Wikidata, reporting errors where necessary.
- Converting manual redlink lists for English Wikipedia's Women in Red project to use Wikidata, by creating items for the redlinked women and adding appropriate described by source (P1343) entries (see the QuickStatements sketch after this list).
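For the OSM linking above, the Overpass side of the matching looks roughly like the following. This is a minimal sketch, assuming the public Overpass and Wikidata Query Service endpoints; Virginia and the exact-name comparison are stand-ins for whatever state and fuzzier matching a real pass would use.

```python
"""Sketch: find OSM schools with no wikidata=* tag and pair them with
Wikidata school items by name. Endpoints are public; the state and the
exact-name match are illustrative assumptions."""
import requests

OVERPASS = "https://overpass-api.de/api/interpreter"
WDQS = "https://query.wikidata.org/sparql"

# Schools in Virginia (illustrative) that don't yet carry a wikidata tag.
overpass_query = """
[out:json][timeout:120];
area["name"="Virginia"]["admin_level"="4"]->.va;
nwr["amenity"="school"][!"wikidata"](area.va);
out tags center;
"""
osm_schools = requests.post(OVERPASS, data={"data": overpass_query}).json()["elements"]

# Wikidata items for schools (Q3914) located in Virginia (Q1370).
sparql = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31/wdt:P279* wd:Q3914 ;
        wdt:P131* wd:Q1370 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""
r = requests.get(WDQS, params={"query": sparql, "format": "json"},
                 headers={"User-Agent": "osm-school-matching-sketch/0.1"})
by_label = {b["itemLabel"]["value"].lower(): b["item"]["value"]
            for b in r.json()["results"]["bindings"]}

# Naive exact-name matching; a real pass needs fuzzier logic and review.
for school in osm_schools:
    name = school.get("tags", {}).get("name", "").lower()
    if name in by_label:
        print(school["type"], school["id"], "->", by_label[name])
```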
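The Women in Red conversion bottoms out in QuickStatements commands. A minimal sketch, assuming the names and a source QID have already been pulled out of the redlink list; the sample row and QID below are placeholders.

```python
"""Sketch: emit QuickStatements v1 commands that create an item per
redlinked woman. The input data here is made up."""

# (name, QID of the work that describes her) -- placeholder data
redlinks = [
    ("Jane Example", "Q12345"),  # Q12345 stands in for the source's QID
]

lines = []
for name, source_qid in redlinks:
    lines.append("CREATE")
    lines.append(f'LAST\tLen\t"{name}"')        # English label
    lines.append("LAST\tP31\tQ5")               # instance of: human
    lines.append("LAST\tP21\tQ6581072")         # sex or gender: female
    lines.append(f"LAST\tP1343\t{source_qid}")  # described by source
print("\n".join(lines))  # paste into https://quickstatements.toolforge.org/
```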
Most of these are activities that were at least partially automated, though a few were done thoroughly by hand.
- Updating the entries for agencies of US state governments, adding appropriate instance of (P31), applies to jurisdiction (P1001), official website (P856), and Twitter username (P2002) values
- Cleaning up the publication history of several academics who had particularly messed-up profiles
- Importing arXiv author ID (P4594) by matching with ORCID iD (P496) (see the ORCID sketch after this list)
- Importing identifiers for writers at particular publications, such as The Atlantic author ID (P6791), Harper's author ID (P6784), and LARB contributor ID (P5639), by searching English Wikipedia for appropriately formatted links (see the exturlusage sketch after this list)
- Extracting climate data for global warming potential (P2565) on behalf of User:Leyo
- Cleaning hundreds of invalid Google Scholar author ID (P1960) values from the (at the time) very screwed-up entry for ESSEC Business School (Q273642) and attempting to match them to actual humans
- Adding Academic Tree ID (P2381) for people whose biography on that site included a Wikipedia link
- Adding some basic district information to everyone in the Virginia House of Delegates (Q320275) at the time
- Importing Invaluable.com person ID (P4927) based on links to Invaluable in Commons file descriptions, then winding through creator pages and categories to get to artists.
- Linking Commons categories about artists that matched on English label, year of birth, and year of death (see the label-and-dates sketch after this list)
- Import Photographers' Identities Catalog ID (P2750) by matching on VIAF ID (P214) (the database is on GitHub)
- Adding appropriate category contains (P4224) statements to the subcategories of Category:Musicians by band (Q7411243) on English Wikipedia (Q328), then adding member of (P463) claims on musicians based on their category membership (see the category-members sketch after this list).
- Merge duplicates in The Peerage (Q21401824) based on in-article links in English Wikipedia (Q328)
- Do something similar with Find a Grave (Q63056), though adding IDs instead of merging duplicates.
- Do something similar on Wikimedia Commons (Q565), though with some additional logic to wind through categories
- Adding Academic Tree ID (P2381) based on matching with various identifiers linked in the description field (exposed through the Mix'n'Match catalog)
- Add Mathematics Genealogy Project ID (P549) based on external links in English Wikipedia (Q328) and German Wikipedia (Q48183)
- Importing the researchers and the citation graph for Janelia Research Campus (Q1319362)
- Importing official colors from a bunch of Virginia high schools listed in an official Virginia High School League (Q7934309) directory
- Import all episodes of Radiolab (Q2856080), and link in guests where possible/practical
- Parse The Political Graveyard (Q7757656) (or else get the author to send me a database) so that I can get Political Graveyard politician ID (P8462) into Mix'n'Match
- Adding point in time (P585) and award rationale (P6208) qualifiers for ACM Fellow (Q18748039)
- Inventories of American Painting and Sculpture control number (P4814) is a very underused property, and given all the links on Wikipedia it could be used far more. However, ensuring precise matches is trickier here than usual.
- Importing data on nuclear tests from official records, and matching to the Wolfram Language codes in Mix'n'Match
- Giving every public school district in the US an official website, through a mix of Common Core of Data (Q25834722) and manual internet search
- Scraping the alumni sections of English Wikipedia pages of high schools in Virginia (and possibly beyond, if I can get this to work)
- Finding a way to extract book titles from Wikipedia articles to make it possible to auto-match with VIAF's "Works" section (see the italics sketch after this list)
- Scraping journal article citations either from the Wikipedia pages of academics or from publication lists on their academic webpages, and matching things that way
- Somehow gathering a database of official email addresses of academics (probably off-wiki to prevent spam) and linking items for corresponding authors.
- Get Our Campaigns (Q97064980) data into Mix'n'Match
- Look for cases where a Library of Congress authority record links to Wikipedia, and check if the Wikidata item has that LCCN on it (see the id.loc.gov sketch after this list).
- Propose a new property for the Royal Society Collections, and then figure out how to mass-import them into articles.
- US state legislatures pretty much all have giant tables with politician links and their districts. So does Vote Smart (Q7249368). It should be possible to turn those wikitext tables into Wikidata statements, and then use this to mass-link Vote Smart ID (P3344), which currently has poor coverage. Also, when a state has items for individual districts, those can be added as qualifiers (see the read_html sketch after this list).
- This form for quickly converting a FamilySearch record into
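For the arXiv/ORCID matching above, the Wikidata side is a single query; the lookup against arXiv itself is the fiddly part and is only stubbed out here. A sketch, assuming the public WDQS endpoint:

```python
"""Sketch: pull items with an ORCID iD (P496) but no arXiv author ID
(P4594). resolve_arxiv_author() is hypothetical and not implemented."""
import requests

WDQS = "https://query.wikidata.org/sparql"
QUERY = """
SELECT ?item ?orcid WHERE {
  ?item wdt:P496 ?orcid .                   # has an ORCID iD
  FILTER NOT EXISTS { ?item wdt:P4594 [] }  # but no arXiv author ID yet
}
LIMIT 1000
"""
r = requests.get(WDQS, params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "arxiv-orcid-matching-sketch/0.1"})
for b in r.json()["results"]["bindings"]:
    qid = b["item"]["value"].rsplit("/", 1)[-1]
    orcid = b["orcid"]["value"]
    # resolve_arxiv_author(orcid) would go here: checking whether arXiv
    # exposes an author identifier tied to that ORCID is the real work.
    print(qid, orcid)
```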
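The search for appropriately formatted links can go through the MediaWiki exturlusage API, roughly as below, here for The Atlantic author ID (P6791). The URL pattern for its author pages is an assumption; the regex and the property are the parts that change per publication.

```python
"""Sketch: find enwiki articles linking to theatlantic.com/author/...
and extract the author slug as a candidate identifier."""
import re
import requests

API = "https://en.wikipedia.org/w/api.php"
# Assumed URL shape for The Atlantic's author pages.
AUTHOR_RE = re.compile(r"theatlantic\.com/author/([a-z0-9-]+)", re.I)

params = {
    "action": "query", "list": "exturlusage", "format": "json",
    "euquery": "theatlantic.com/author/", "eunamespace": 0, "eulimit": 500,
}
found = {}
while True:
    data = requests.get(API, params=params).json()
    for hit in data["query"]["exturlusage"]:
        m = AUTHOR_RE.search(hit["url"])
        if m:
            # article title -> candidate The Atlantic author ID (P6791)
            found.setdefault(hit["title"], set()).add(m.group(1).lower())
    if "continue" not in data:
        break
    params.update(data["continue"])

for title, ids in sorted(found.items()):
    print(title, sorted(ids))
```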
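For the label-plus-dates matching of Commons artist categories, one workable shape is to push the candidate search into WDQS's mwapi service and filter on birth and death years there. A sketch, assuming the (name, birth year, death year) triples have already been scraped from the category pages:

```python
"""Sketch: label search via WDQS's EntitySearch service, narrowed by
birth (P569) and death (P570) years."""
import requests

WDQS = "https://query.wikidata.org/sparql"

def candidates(name: str, born: int, died: int) -> list[str]:
    query = """
    SELECT ?item WHERE {
      SERVICE wikibase:mwapi {
        bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                        wikibase:api "EntitySearch" ;
                        mwapi:search %r ;
                        mwapi:language "en" .
        ?item wikibase:apiOutputItem mwapi:item .
      }
      ?item wdt:P569 ?b ; wdt:P570 ?d .
      FILTER(YEAR(?b) = %d && YEAR(?d) = %d)
    }
    """ % (name, born, died)
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    return [b["item"]["value"] for b in r.json()["results"]["bindings"]]

print(candidates("Vincent van Gogh", 1853, 1890))
```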
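The musicians-by-band pass is mostly category plumbing. A sketch of the enwiki side using the categorymembers API; the Beatles category and QID are just an illustration, and the real run walked the whole subcategory tree.

```python
"""Sketch: list the members of one band-membership category and emit
member of (P463) candidates."""
import requests

API = "https://en.wikipedia.org/w/api.php"

def category_members(cat: str):
    """Yield main-namespace members of an enwiki category."""
    params = {"action": "query", "list": "categorymembers", "format": "json",
              "cmtitle": cat, "cmnamespace": 0, "cmlimit": 500}
    while True:
        data = requests.get(API, params=params).json()
        yield from data["query"]["categorymembers"]
        if "continue" not in data:
            return
        params.update(data["continue"])

for member in category_members("Category:The Beatles members"):
    print(member["title"], "-> member of (P463): Q1299")
```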
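For extracting book titles, one crude but cheap starting point is that wikitext marks most work titles with ''italics''. A sketch, assuming titles containing apostrophes can be ignored on a first pass; the page title is illustrative.

```python
"""Sketch: pull italicized spans out of an article's wikitext as
candidate work titles to compare against VIAF's Works list."""
import re
import requests

API = "https://en.wikipedia.org/w/api.php"
params = {"action": "parse", "page": "Ursula K. Le Guin",
          "prop": "wikitext", "format": "json", "formatversion": 2}
wikitext = requests.get(API, params=params).json()["parse"]["wikitext"]

# ''italics'' (but not '''bold''') usually marks a title in wikitext.
titles = set(re.findall(r"(?<!')''([^'\n]{2,80}?)''(?!')", wikitext))
print(sorted(titles)[:20])
```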
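For the LCCN check, both halves are single requests per record. A sketch; the record number is illustrative, and the "links to Wikipedia" test here is just a string search over the id.loc.gov JSON rather than proper MADS parsing.

```python
"""Sketch: does an LC name authority record mention Wikipedia, and does
any Wikidata item already carry that LCCN (P244)?"""
import requests

WDQS = "https://query.wikidata.org/sparql"

def wikidata_has_lccn(lccn: str) -> bool:
    query = 'ASK { ?item wdt:P244 "%s" }' % lccn
    r = requests.get(WDQS, params={"query": query, "format": "json"})
    return r.json()["boolean"]

def lc_record_mentions_wikipedia(lccn: str) -> bool:
    # id.loc.gov serves each name authority record as JSON.
    r = requests.get(f"https://id.loc.gov/authorities/names/{lccn}.json")
    return "wikipedia" in r.text.lower()

lccn = "n79021164"  # illustrative record
if lc_record_mentions_wikipedia(lccn) and not wikidata_has_lccn(lccn):
    print(lccn, "links to Wikipedia but no item carries that LCCN")
```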
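For the legislature tables, pandas can usually lift a rendered roster directly rather than parsing wikitext. A sketch, assuming the page renders its roster as an HTML table with a district column; every real table needs its own column mapping.

```python
"""Sketch: pull the roster table off a rendered legislature page."""
import pandas as pd

# Illustrative page; read_html pulls every <table> on the rendered page.
URL = "https://en.wikipedia.org/wiki/Virginia_House_of_Delegates"
tables = pd.read_html(URL)

# Pick the first table that looks like a roster (has a District column).
roster = next(t for t in tables
              if any("district" in str(c).lower() for c in t.columns))
for _, row in roster.iterrows():
    # name + district is enough to line a row up against Vote Smart's
    # directory and against per-district Wikidata items.
    print(row.to_dict())
```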
Should be created
- The tool described in phab:T249687
- (If this is possible): A small SPARQL federated service that would bridge SPARQL queries and PagePile lists (a workaround sketch follows)
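Until something like that bridge exists, the workaround is to inline a PagePile into a VALUES clause by hand. A sketch; the pile ID is a placeholder, and my memory of the PagePile API's exact shape should be double-checked before relying on it.

```python
"""Sketch: fetch an enwiki PagePile, resolve titles to QIDs, and splice
them into a SPARQL VALUES clause."""
import requests

PILE_ID = 12345  # placeholder pile ID
pile = requests.get("https://pagepile.toolforge.org/api.php",
                    params={"id": PILE_ID, "action": "get_data",
                            "format": "json"}).json()
titles = pile["pages"]

# Resolve titles to QIDs through the enwiki API (50 titles per request).
API = "https://en.wikipedia.org/w/api.php"
qids = []
for i in range(0, len(titles), 50):
    r = requests.get(API, params={
        "action": "query", "prop": "pageprops", "ppprop": "wikibase_item",
        "titles": "|".join(titles[i:i + 50]), "format": "json",
        "formatversion": 2}).json()
    qids += [p["pageprops"]["wikibase_item"]
             for p in r["query"]["pages"] if "pageprops" in p]

values = " ".join(f"wd:{q}" for q in qids)
sparql = "SELECT ?item ?dob WHERE { VALUES ?item { %s } ?item wdt:P569 ?dob . }" % values
print(sparql)  # run against https://query.wikidata.org/
```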