Wikidata:Dataset Imports/Unpaywall

Guidelines for using this page

Documenting the import

  • Guidelines on how to import a dataset into Wikidata are available at Wikidata:Data Import Guide.
  • Please include notes on all steps of the process.
  • Once a dataset has been imported into Wikidata, please edit the page to change the progress status from "in progress" to "complete".
  • It is strongly recommended to use Visual Editor when making changes to this page, particularly for editing any of the tables.

Creating a Wikidata item for the dataset

  • Please create a Wikidata item for the dataset. This will allow us to improve the coverage of datasets on Wikidata and to understand which datasets are available on that topic and which of them have been added to Wikidata.
  • If you are working with a very large dataset, you can break it into smaller Mix'n'match catalogues, but only create one Wikidata item.
  • Link the dataset Wikidata item to this page using Wikidata Dataset Imports page (P5195).

Getting help

  • If your dataset import runs into issues, please edit the page to change the progress status from "in progress" to "help needed".
  • You can ask for help on Wikidata:Project chat.


Dataset name






Dataset description

A data set linking DOIs to freely available versions of the corresponding article, including preprints (e.g. on arXiv) and copies in institutional repositories.

Additional information

Progress of import

The table below is used to track the progress of importing this dataset. The suggested column headings are most applicable to data being imported from a spreadsheet - you can change some column headings or add new columns as required to best describe the progress of this import.

Data sets used:

Subtask: Import arXiv identifiers

Process import:
  • Code prepared
  • Initial data set (138k items) processed

Structure of data within Wikidata:
  • Automatic detection and decoding of honey bee waggle dances. (Q47562874)

Match the dataset to Wikidata:
  • Using wd_doi_ids.ndjson.gz provided by
  • Mapping by case-insensitive DOI matching

Importing data into Wikidata:
  • Export using QuickStatements (Q20084080)
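The matching step noted above (case-insensitive DOI matching, followed by picking out arXiv identifiers) could be sketched roughly as follows. This is a minimal illustration, not the code actually used for the import: the input shapes (`(qid, doi)` pairs and records with `"doi"` and `"urls"` keys), the function names, and the URL pattern are all assumptions, since the layout of wd_doi_ids.ndjson.gz and of the Unpaywall records is not described on this page.

```python
import re

# Assumed pattern for arXiv URLs: matches new-style IDs (e.g. 1501.00001) and
# old-style IDs (e.g. hep-th/9901001); a trailing version suffix is dropped.
ARXIV_URL = re.compile(
    r"arxiv\.org/(?:abs|pdf)/"
    r"((?:\d{4}\.\d{4,5})|(?:[a-z\-]+(?:\.[A-Z]{2})?/\d{7}))(?:v\d+)?",
    re.IGNORECASE,
)


def normalize_doi(doi):
    """Lower-case and trim a DOI so that comparison is case-insensitive."""
    return doi.strip().lower()


def arxiv_id_from_url(url):
    """Return the arXiv identifier found in a URL, or None if there is none."""
    match = ARXIV_URL.search(url)
    return match.group(1) if match else None


def match_arxiv_ids(wikidata_dois, unpaywall_records):
    """Yield (qid, arxiv_id) pairs for records whose DOI is known to Wikidata.

    wikidata_dois: iterable of (qid, doi) pairs (hypothetical shape).
    unpaywall_records: iterable of dicts with "doi" and "urls" keys
    (hypothetical shape).
    """
    # Index Wikidata items by normalized DOI for constant-time lookup.
    qid_by_doi = {normalize_doi(doi): qid for qid, doi in wikidata_dois}
    for record in unpaywall_records:
        qid = qid_by_doi.get(normalize_doi(record["doi"]))
        if qid is None:
            continue  # no Wikidata item carries this DOI
        # Keep the first arXiv identifier found among the record's URLs.
        for url in record["urls"]:
            arxiv_id = arxiv_id_from_url(url)
            if arxiv_id:
                yield qid, arxiv_id
                break
```

Normalizing both sides to lower case before building the lookup table keeps the match independent of how either source capitalizes a DOI.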

Edit history

Use the table below to list batches of edits that have been completed for this dataset. Ideally each entry should have all applicable columns filled out, but at a minimum please make sure to add a date and description to give an idea of what was added to Wikidata and when.

Date | Description | Method | Properties | Qualifiers | References | Statements added | Statements removed | Link to import sheet
27 Oct 2018 | Initial arXiv import | See above | arXiv ID (P818) | - | Unpaywall (Q38352586) | 7217 | 0 | [2]
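For illustration, one way to produce the QuickStatements input for a batch like the initial arXiv import (arXiv ID (P818) statements referenced to Unpaywall (Q38352586)) is sketched below. It uses the tab-separated QuickStatements V1 command syntax. The choice of "stated in" (P248, written `S248` in a reference position) as the reference property is an assumption, as this page does not say which reference property the import used; the function names are likewise hypothetical.

```python
def quickstatements_line(qid, arxiv_id, source_qid="Q38352586"):
    """Build one tab-separated QuickStatements V1 command.

    Adds arXiv ID (P818) to the item, with a reference pointing at the
    source item. S248 ("stated in") is an assumed reference property.
    """
    # String values must be wrapped in double quotes in V1 syntax.
    return "\t".join([qid, "P818", f'"{arxiv_id}"', "S248", source_qid])


def quickstatements_batch(pairs):
    """Join commands for an iterable of (qid, arxiv_id) pairs, one per line."""
    return "\n".join(quickstatements_line(qid, arxiv_id) for qid, arxiv_id in pairs)
```

The resulting text can be pasted into QuickStatements as a V1 batch; running a small sample first makes it easier to check that the reference is attached as intended.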

Discussion of import

These headings are generally useful; please change this section to suit your needs.

Wikidata item for dataset

Import data into spreadsheet

Format the spreadsheet to import the data

Structure of data within Wikidata

Field name | Wikidata property | Notes

Match the dataset to Wikidata

Importing data into Wikidata

Import completion notes



Queries and expected results

Query link | Description | Expected results

Schedule of new data released