Wikidata:WikiProject Source MetaData


WikiProject Source Metadata

WikiCite: creating a shared bibliographic repository for all Wikimedia projects
Topics co-occurring with Zika virus (Q202864)
Usage history of some key Wikidata properties around bibliographic and citation data as of January 2019.

The aims of the WikiProject Source Metadata are:

  • to act as a hub for work in Wikidata involving citation data and bibliographic data as part of the broader WikiCite initiative.
  • to define a set of properties that can be used by citations, infoboxes, and Wikisource.
  • to map and import all relevant metadata that currently is spread across Commons, Wikipedia, and Wikisource.
  • to establish methods to interact with this metadata from different projects.
  • to create a large open bibliographic database within Wikidata.
  • to reveal, build, and maintain community stakeholdership for the inclusion and management of source metadata in Wikidata.

There have been various proposals over the years for similar projects (see meta:WikiCite for details). Now that Wikidata is here, we can make it happen.

Current activities

Ongoing imports

Properties

See this subpage for more details.

Projects

Examples

Timeline (from 1952 until early 2016) of Wikidata items that have a publication date (P577) and whose main subject (P921) is set to Zika virus (Q202864) and/or Zika fever (Q8071861), as per this Wikidata list
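A timeline of this kind can be derived from a query that counts items per publication year. The sketch below only assembles a SPARQL query string in Python. The property and item IDs come from the caption; the function name, the variable names, and the per-year grouping are illustrative assumptions, not the query behind the original figure.

```python
def timeline_query(topic_qids):
    """Build a SPARQL query counting items per publication year.

    Assumption: grouping by year approximates the timeline described in
    the caption; the original figure may have been produced differently.
    """
    topics = " ".join(f"wd:{qid}" for qid in topic_qids)
    return f"""
SELECT ?year (COUNT(DISTINCT ?item) AS ?count) WHERE {{
  VALUES ?topic {{ {topics} }}
  ?item wdt:P921 ?topic ;   # main subject
        wdt:P577 ?date .    # publication date
  BIND(YEAR(?date) AS ?year)
}}
GROUP BY ?year
ORDER BY ?year
""".strip()


# Zika virus (Q202864) and Zika fever (Q8071861), as named in the caption.
ZIKA_TIMELINE_QUERY = timeline_query(["Q202864", "Q8071861"])
```

The resulting string can be pasted into the Wikidata Query Service to inspect the counts.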
References in a Wikipedia article

Here is an example that creates a reference list for four articles, based on the following code:

  • {{#invoke:Cite | reflist | Q14405740 Q13416617 Q20058533 Q15567682 }}

This results in

  • Wulf D. Schleip and Mark O'Shea, "Annotated checklist of the recent and extinct pythons (Serpentes, Pythonidae), with notes on nomenclature, taxonomy, and distribution", ZooKeys, vol. 66, 66, , doi: 10.3897/ZOOKEYS.66.683, PubMed Central ID: 3088416 , Creative Commons Attribution 3.0 Unported
  • Stefan Martin Schmid, Bernhard Fügenschuh and Eduard Kissling, "Tectonic map and overall architecture of the Alpine orogen", Swiss Journal of Geosciences, vol. 97, 1, , doi: 10.1007/S00015-004-1113-X
  • Rudolf Jung, "Uffenbach, Zacharias Konrad von", Allgemeine Deutsche Biographie, 39th volume, vol. 39,
  • Eric N. Rittmeyer, Allen Allison, Michael C. Gründler, Derrick K. Thompson and Christopher C. Austin, "Ecological guild evolution and the discovery of the world's smallest vertebrate.", PLoS ONE, vol. 7, 1, , doi: 10.1371/JOURNAL.PONE.0029797, PubMed Central ID: 3256195 , Creative Commons Attribution 2.5 Unported
  • «ГОСТ» examples:

    Tasks

    For a list of specific tasks and todos (missing data, missing properties, cleanup tasks) see /ToDo

    Workflow for profiling researchers

    How to create a scholarly profile for a researcher in Wikidata

    1. Consider the platform
      1. Visit Wikidata
        1. Wikidata is the database which anyone can edit
        2. The Wikidata community curates this data
      2. Consider WikiCite
        1. WikiCite is the community project within Wikidata which curates source metadata
        2. The WikiCite community is a subset of the Wikidata community
      3. Consider how anyone accesses data
        1. Scholia is the specialized Wikidata tool for viewing academic profiles of people, topics, universities, etc.
          1. If a profile looks good in Scholia, then the data is correctly formatted to be maximally open and accessible in Wikidata and the Semantic Web
          2. Making a profile look good in Scholia is the quickest and easiest way to format data once and for all
        2. The Wikidata Query Service is the general Wikidata tool for viewing groups of Wikidata content
        3. Everyone else, including big tech, big publishing, big government, etc., scrapes Wikidata and reuses this content, so what is in Wikidata goes everywhere else
    2. Identify or create the Wikidata item for the researcher to profile
      1. use basic Wikidata search by the person's name
        1. if the item for the person exists, then use it
        2. if the item does not exist, then create it
          1. follow the instructions for creating a profile for a human in Wikidata:WikiProject Biographies
          2. add enough information to uniquely identify this person by name and a few other characteristics
        3. If there is ambiguity because multiple people have the same name and characteristics, then create a new item. Items can be merged, and merging duplicates is easier to fix than separating mixed items.
    3. Try to add the ORCID, which is a unique scientific identifier
      1. visit https://orcid.org/
      2. search for the researcher
      3. if there is an easy and obvious match, then grab the ORCID
        1. go back to Wikidata
        2. click "add statement", enter ORCID, paste the ORCID, publish
        3. run ORCIDator, a Wikidata tool to import ORCID data into Wikidata
          1. Access through the SourceMD tool - https://www.wikidata.org/wiki/Wikidata:SourceMD
          2. further documentation at https://www.wikidata.org/wiki/Wikidata:ORCIDator
      4. there is often no ORCID, or the ORCID record is blank, or there is ambiguity; skip this step if so
    4. Use the "Wikidata Author Disambiguation tool" at https://tools.wmflabs.org/author-disambiguator/, which matches papers indexed in Wikidata to the target researcher
      1. Enter the target researcher's name
        1. as of 2019 the tool is clunky
        2. try name variations, including initials, or whatever is likely in an academic paper
      2. Identify name variations
        1. go back to the Wikidata item for the person
        2. add the variations to the "also known as" field at the top of the item
        3. noting the variations greatly assists ongoing maintenance and profile updates
    5. Wait
      1. Wikidata is a nonprofit project of the Wikimedia Community
      2. technical infrastructure is modest; as of 2019, updates typically take 5-30 minutes
      3. Like Wikipedia, Wikidata depends on volunteer contributors of content and donor funding
      4. thanks for editing, it is the most valuable contribution anyone can make
    6. View incomplete profile on Scholia
      1. enter the person's name - it should autocomplete
      2. the profile is generated from the available data
    7. Use Scholia's "missing content" tool
      1. access is unusual: add "/missing" at the end of the Scholia URL
      2. the missing tool is actually a collection of tools which search and suggest possible data to add to the profile
      3. building out the network of collaborators is easy from here
        1. consider building profiles for top co-authors
        2. consider building profiles for people who commonly cite the target researcher's papers
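Two steps in this workflow are mechanical enough to sketch in Python: checking that a copied ORCID is well-formed before pasting it into Wikidata, and building the Scholia profile and "/missing" URLs for an item. This is a minimal sketch under stated assumptions. The checksum follows the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers. The Scholia host and the `/author/` path are assumptions, since this page gives no Scholia URL, and all function names are hypothetical.

```python
import re

# Assumption: public Scholia host and author path; not stated on this page.
SCHOLIA_BASE = "https://scholia.toolforge.org"


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if the ORCID is well-formed and its checksum matches."""
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


def scholia_author_url(qid, missing=False):
    """Build the Scholia author URL, optionally with the "/missing" suffix."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    url = f"{SCHOLIA_BASE}/author/{qid}"
    return url + "/missing" if missing else url
```

A passing checksum only catches copy-and-paste errors; it does not confirm that the ORCID belongs to the intended researcher, so the manual matching step above still applies.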

    Possible Data Collaborators

    ContentMine presentation, Wikimania 2014. Wikiwish: "An Open Bibliography of science, updated daily" (the first bulletpoint at 27:30)
    Citing as a public service: presentation by User:DarTar at the 2015 Wikipedia Science Conference pitching Wikidata as an open bibliographic and citation data repository

    Some possible Data Collaborators have expressed interest in working on source metadata in Wikidata; others might usefully be approached.

    OCLC, which runs WorldCat, is very keen on collaborating with Wikidata; User:Maximiliankleinoclc wrote a letter about the possibilities.

    ContentMine has some excellent open software tools, which we could use to let Wikidata answer queries like "List all the review papers ever written on malaria vaccines", "List all the articles that mention Lygodactylus williamsii", "List every paper ever written by John Tuzo Wilson" and "List all the papers cited in Wikipedia articles that have been retracted". They listed "An Open Bibliography of science, updated daily" as a "wikiwish" at Wikimania 2014, apparently unaware that this project had already been started at a slightly earlier workshop.
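Once the underlying statements exist in Wikidata, a question such as "every paper by a given author" becomes a short SPARQL query. The Python sketch below builds such a query and a request URL for the public Wikidata Query Service endpoint. The function names and the default limit are hypothetical, and the query only finds works linked through author (P50) statements; works that carry only an author name string (P2093) will not match.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def works_by_author_query(author_qid, limit=100):
    """Build a SPARQL query listing works whose author (P50) is the item."""
    return f"""
SELECT ?work ?workLabel ?date WHERE {{
  ?work wdt:P50 wd:{author_qid} .          # author
  OPTIONAL {{ ?work wdt:P577 ?date . }}    # publication date, if recorded
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?date
LIMIT {int(limit)}
""".strip()


def query_url(sparql):
    """Return a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})
```

To use it, pass the researcher's item ID to `works_by_author_query` and fetch the URL returned by `query_url` with any HTTP client.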

    PLOS has an API for RichCitations, which contains metadata on all PLOS papers up through late 2014. Rich Citations is a novel structured format to express each citation as a data element, and it includes a set of useful, additional terms specific to scholarly literature that enable research about the knowledge web citations create. It also includes a display feature much like Reference Tooltips, but linked to a database (which is open licensed), so it can update metainformation. They presented at Wikimania 2014 and are keen to collaborate and share their results with us.

    Zotero is interested in the idea of a proofread metadata source. Some Zotero users currently upload to cloud storage; we might build tools to let them upload here instead. CiteseerX has a large open-licensed database of article metadata and might want to set up an exchange, but has not responded to e-mails.

    The Cochrane Collaboration is developing an API to its metadata (they were contacted about this project in July 2014, so this use case may have helped shape the API). They produce large amounts of non-conventional metadata on works they review, and on works they produce, both of which Wikimedians quote.

    Institutional repositories are also increasingly interested in open APIs and linked databases, and seem generally receptive to this project. The university-run academic search engine BASE aggregates and normalizes these repositories and makes its data collection available for non-commercial purposes.

    Resources

    Subpages

    The following subpages belong to the project:

    Contact

    Participants

    The first list has now reached the maximum number possible for {{Ping project}}. Please therefore add your name to the second list below.

    To ping both lists, use {{Ping project|Source MetaData}} and {{Ping project|Source MetaData/More}} in two different posts.


    The participants listed below can be notified using the following template in discussions:

    {{Ping project|Source MetaData}}


    Historical discussions

    There have been historical discussions about Wikidata hosting information about the sources of data.

    See also