User:Daniel Mietchen/365 climate edits

This page assists in documenting my contributions to the 365 climate edits initiative.

Scope

  • Start time: January 1, 2023 (Q69306665)      
  • End time: December 31, 2023 (Q69307031)      
  • Tasks:
    • Make at least one climate-related edit per day, anywhere in the Wikimedia ecosystem
    • Document the edits on an ongoing basis
  • Rules:
    • For the purpose of this activity, I understand a "day" as the time frame from the earliest to the latest point at which the date in question is valid anywhere on Earth.
    • An "edit" is a change to a Wikimedia wiki that is visible in the public version history of the wiki page in question.
    • Edits that are part of an edit batch are eligible, but if an edit from a given batch has already been selected as the contribution for any given day, then further edits from the same batch are not eligible for future days.
  • Recent changes
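
The "day" rule above can be made concrete with a small sketch. It assumes that the extreme civil time offsets are UTC+14 and UTC-12 (the page itself does not name them) and treats the end of the window as exclusive; the function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Tuple

# Assumption: the earliest and latest civil offsets in use are UTC+14 and UTC-12.
EARLIEST_ZONE = timezone(timedelta(hours=14))
LATEST_ZONE = timezone(timedelta(hours=-12))


def day_window_utc(day: date) -> Tuple[datetime, datetime]:
    """Return the UTC interval [start, end) during which `day` is the local date somewhere."""
    # The date first begins at local midnight in the earliest zone ...
    start = datetime.combine(day, time.min, tzinfo=EARLIEST_ZONE)
    # ... and last ends at the following local midnight in the latest zone.
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=LATEST_ZONE)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


def counts_for_day(edit_time_utc: datetime, day: date) -> bool:
    """Check whether an edit timestamp (timezone-aware, UTC) falls into the window for `day`."""
    start, end = day_window_utc(day)
    return start <= edit_time_utc < end
```

Under these assumptions, each calendar day spans 50 hours of UTC time, so the windows of consecutive days overlap and a single edit timestamp can qualify for more than one date.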

Gallery of media files worked on

Daily examples of my #365climateedits contributions

Below, I am linking a sample edit (including newly created pages) for each day of the year 2023 (Q49622). The list of days is generated via this query, which will probably see some modifications over the year. In many cases, I am using the template {{Q'''}}, which provides links (via the icons) to WD:Reasonator, SQID and Scholia to facilitate further exploration, including of related content.

Most recent

Keeping this to three days for now.

Upcoming

Nothing for now.

Past

Queries

German nouns containing the string "klima" but having no sense statements

The following query uses these:

  • Properties: item for this sense (P5137)     
    # Assumed projection: the original SELECT line is not preserved here.
    SELECT ?lexeme ?lemma
    (CONCAT("", "language=",  ENCODE_FOR_URI(LANG(?lemma)), "&q=", ENCODE_FOR_URI(STR(?lemma))) AS ?Url2)
    WHERE {
      ?lexeme dct:language wd:Q188 .                                  # German lexemes
      FILTER NOT EXISTS { ?lexeme ontolex:sense / wdt:P5137 ?item }   # no sense linked to an item
      ?lexeme wikibase:lemma ?lemma ;
              wikibase:lexicalCategory wd:Q1084 .                     # nouns only
      FILTER REGEX(LCASE(?lemma), "klima")
    }
    LIMIT 500

German nouns containing the string "klima" but having no pronunciation audio file

The missing audio files can be recorded via LinguaLibre's dedicated list.

The following query uses these:

  • Properties: pronunciation audio (P443)     
    # Assumed projection: the original SELECT line is not preserved here.
    SELECT ?lexeme ?lemma
    (CONCAT("", "language=",  ENCODE_FOR_URI(LANG(?lemma)), "&q=", ENCODE_FOR_URI(STR(?lemma))) AS ?Url2)
    WHERE {
      ?lexeme dct:language wd:Q188 .                                  # German lexemes
      FILTER NOT EXISTS { ?lexeme wdt:P443 ?pronunciation_audio. }    # no pronunciation audio
      ?lexeme wikibase:lemma ?lemma ;
              wikibase:lexicalCategory wd:Q1084 .                     # nouns only
      FILTER REGEX(LCASE(?lemma), "klima")
    }
    LIMIT 500

Ukrainian nouns containing the string "кліма" but having no sense statements

The following query uses these:

  • Properties: item for this sense (P5137)     
    # Assumed projection: the original SELECT line is not preserved here.
    SELECT ?lexeme ?lemma
    (CONCAT("", "language=",  ENCODE_FOR_URI(LANG(?lemma)), "&q=", ENCODE_FOR_URI(STR(?lemma))) AS ?Url2)
    WHERE {
      ?lexeme dct:language wd:Q8798 .                                 # Ukrainian lexemes
      FILTER NOT EXISTS { ?lexeme ontolex:sense / wdt:P5137 ?item }   # no sense linked to an item
      ?lexeme wikibase:lemma ?lemma ;
              wikibase:lexicalCategory wd:Q1084 .                     # nouns only
      FILTER REGEX(LCASE(?lemma), "кліма")
    }
    LIMIT 500

Common n-grams in titles of works about palaeoclimate reconstructions

The following query uses these:

  • Properties: main subject (P921)     , title (P1476)     , KIT Linked Open Numbers ID (P5176)     , numeric value (P1181)     
    # Most frequent n-grams from a random set of 1000 publications on a given topic
    SELECT DISTINCT ?Ngram ?N ?Count ?Length ?Dashes ?Score ?ExamplePub ?ExamplePubTitle
    WITH
    { # Generating a list of entities to be analyzed
      SELECT ?Publication
      WHERE {
          SERVICE bd:sample { ?Publication wdt:P921 wd:Q116146313 . bd:serviceParam bd:sample.limit 1000 }
      }
    } AS %items 
    WITH
    { # Preprocessing the titles
      SELECT ?Title ?Publication ?Seeds ?ClearTitleLength
      WHERE {
          INCLUDE %items
          ?Publication wdt:P1476 ?Title.
          BIND (REPLACE(STR(?Title),"[\\.:,;\\[\\]\\?()$]","") AS ?ClearTitle) # remove some frequent special characters, including colons and semicolons
          BIND(STRLEN(?ClearTitle) AS ?ClearTitleLength) 
          # Basic processing of the titles
          BIND ("::: ::: ::: ::: ::: ::: ::: ::: " AS ?StartCodon)
          BIND (" ;;; ;;; ;;; ;;; ;;; ;;; ;;; ;;;" AS ?StopCodon)
          BIND (LCASE(CONCAT(?StartCodon , # add start codon of colons to assist with processing of n-grams at beginning of title
                                ?ClearTitle , # assumed: the cleaned title sits between the two codons
                                ?StopCodon)) # add stop codon of semicolons to assist with processing of n-grams at end of title
                         AS ?Seeds )
      }
    } AS %titles 
    WITH
    { # Generating a list of regexes to look for the NumericValue-th word in a string     
      # Based on a query by Jura1
      SELECT ?Regex1 ?Regex2 ?Regex3 ?Regex4 ?NumericValue 
      WHERE {
          # Number items provide the integers 1 to 150 as word positions
          ?NumberItem wdt:P5176 []; wdt:P1181 ?NumericValue . 
          FILTER( ?NumericValue > 0 ) 
          FILTER( ?NumericValue < 151)
          BIND("^([^ ]+ ){" AS ?RegexStart)
          BIND("}([^ ]+) .*" AS ?RegexEnd)
          # Each regex captures two adjacent words; together the four cover eight consecutive words
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue - 1 ), ?RegexEnd ) AS ?Regex1)
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue + 1 ), ?RegexEnd ) AS ?Regex2) 
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue + 3 ), ?RegexEnd ) AS ?Regex3) 
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue + 5 ), ?RegexEnd ) AS ?Regex4) 
      }
    } AS %regexes 
    WITH
    { # Applying the regexes to the titles to extract ngrams (for n <= 8), and counting occurrences of the ngrams across titles
      SELECT
        DISTINCT ?Ngram 
        ?N ?Length ?Dashes # assumed projection: the outer query uses these, but the line listing them is not preserved
        (COUNT(DISTINCT ?Title) AS ?Count)
        (( ?Count * ?Length * ( (?Dashes +1) / ?N) 
         ) AS ?Score)
        (SAMPLE(DISTINCT ?Publication) AS ?ExamplePub)
      WHERE {
            INCLUDE %regexes
            INCLUDE %titles
            # Join eight consecutive words of the padded title
            BIND(CONCAT(
                REPLACE(?Seeds, ?Regex1, "$1"), " ", 
                REPLACE(?Seeds, ?Regex1, "$2"), " ", 
                REPLACE(?Seeds, ?Regex2, "$1"), " ", 
                REPLACE(?Seeds, ?Regex2, "$2"), " ", 
                REPLACE(?Seeds, ?Regex3, "$1"), " ", 
                REPLACE(?Seeds, ?Regex3, "$2"), " ", 
                REPLACE(?Seeds, ?Regex4, "$1"), " ", 
                REPLACE(?Seeds, ?Regex4, "$2")
            ) AS ?NgramCandidate) 
            # Assumed reconstruction (the opening lines are not preserved):
            # strip the codon markers, collapse runs of spaces, trim both ends
            BIND(REPLACE(
                REPLACE(
                  REPLACE(?NgramCandidate, "(:::|;;;)", ""),
                "([ ]{2,})"," "),
                "^ | $", ""
              ) AS ?Ngram) 
            BIND(STRLEN(?Ngram) AS ?Length) 
            FILTER (?Length > 3 )  
            FILTER (?Length <= ?ClearTitleLength )  
            BIND(STRLEN(REPLACE(?Ngram, "\\S", "")) + 1 AS ?N)
            BIND((STRLEN(?Ngram) - STRLEN(REPLACE(?Ngram, "-", "")))  AS ?Dashes)
      }
      GROUP BY ?Ngram ?N ?Count ?Length ?Dashes ?Score ?ExamplePub
      HAVING(?Count > 1)
    } AS %ngrams 
    WHERE {
      INCLUDE %ngrams 
      # Exclude Ngrams starting or ending with any of a set of blacklisted words
      BIND("(a|and|between|during|for|from|in|of|on|or|the|to|with)" AS ?blacklist)
      BIND( CONCAT( "(^", ?blacklist ,")+( )+") AS ?RegexBlackStart)
      BIND( CONCAT( "( )+(", ?blacklist ,")+$") AS ?RegexBlackEnd)
      FILTER (!REGEX(?Ngram, ?RegexBlackStart))
      FILTER (!REGEX(?Ngram, ?RegexBlackEnd))
    #   # Exclude Ngrams too similar to the target
    #   FILTER (!CONTAINS(?Ngram, "climate"))
    #   FILTER (!CONTAINS(?Ngram, "change"))
      ?ExamplePub wdt:P1476 ?ExamplePubTitle.
    }
    GROUP BY ?Ngram ?N ?Count ?Length ?Dashes ?Score ?ExamplePub ?ExamplePubTitle
    ORDER BY DESC(?Score) DESC(?Count) DESC(?Length)
    LIMIT 200
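
The ranking used by the query above can be tried offline on a handful of titles. The following is a minimal sketch and not the query's own method: it swaps the regex-based word extraction for a plain sliding window over the words, but keeps the score Count × Length × (Dashes + 1) / N, the minimum length, the minimum count and the stop-word filter. The function name `score_ngrams` and the sample titles are hypothetical.

```python
import re
from collections import defaultdict

# Same stop-word list and character-stripping pattern as in the query.
STOPWORDS = "(a|and|between|during|for|from|in|of|on|or|the|to|with)"
BLACK_START = re.compile("(^" + STOPWORDS + ")+( )+")
BLACK_END = re.compile("( )+(" + STOPWORDS + ")+$")
STRIP_CHARS = re.compile(r"[.:,;\[\]?()$]")


def score_ngrams(titles, max_n=8):
    """Hypothetical helper: return (ngram, score) pairs, best first."""
    # Collect, for every n-gram, the set of titles it occurs in.
    titles_by_ngram = defaultdict(set)
    for title in titles:
        words = STRIP_CHARS.sub("", title).lower().split()
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                titles_by_ngram[" ".join(words[i:i + n])].add(title)

    rows = []
    for ngram, seen_in in titles_by_ngram.items():
        count, length = len(seen_in), len(ngram)
        # Keep n-grams longer than 3 characters that occur in more than one title.
        if count <= 1 or length <= 3:
            continue
        # Drop n-grams that start or end with a stop word.
        if BLACK_START.search(ngram) or BLACK_END.search(ngram):
            continue
        n = ngram.count(" ") + 1
        dashes = ngram.count("-")
        rows.append((ngram, count * length * (dashes + 1) / n, count, length))

    # Same ordering as the query: score, then count, then length, all descending.
    rows.sort(key=lambda r: (-r[1], -r[2], -r[3]))
    return [(ngram, score) for ngram, score, _, _ in rows]


sample_titles = [
    "Holocene sea-level change in the Baltic",
    "Sea-level change during the Holocene",
    "Tree rings",
]
ranking = score_ngrams(sample_titles)
```

The score favours n-grams that are frequent, long and hyphenated, while dividing by the number of words keeps long phrases from dominating purely because of their length.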

Potential things to work on

See also