This user uses Scholia.
This user uses QuickStatements.
This user loves Wikidata.
This user has made over 1,114,594 contributions to Wikidata.

I am Roderic D. M. Page (Q7356570). You can find me on Twitter as @rdmpage, and my blog iPhylo lists my current projects.

Things to fix

Taxa versus species

Norops duellmani (Q6450757) and Anolis duellmani (Q2814307) are the same taxon. In this example different wikis link to different names, and sometimes the page names don't match.

Essays

User:Rdmpage/Referencing taxon names

Wikidata

How to add references: Help:Sources


Place where WikiCite-related stuff gets discussed: Wikidata talk:WikiProject Source MetaData

Deletions

See and

SELECT DISTINCT ?item ?itemLabel WHERE {
  # Items linked from the requests-for-deletion page, fetched through the MediaWiki API
  SERVICE wikibase:mwapi {
    # the endpoint host name is empty in this copy of the query
    bd:serviceParam wikibase:endpoint "" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "links" .
    bd:serviceParam mwapi:titles "Wikidata:Requests for deletions" .
    bd:serviceParam mwapi:gpllimit "max" .
    bd:serviceParam mwapi:gplnamespace "0" .
    ?item wikibase:apiOutputItem mwapi:title .
  }
  # Keep only the items that have a P6944 identifier
  ?item wdt:P6944 ?id .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}


Property proposals

Ones I've made or have been involved in.




Data quality

See Wikidata:WikiProject_Data_Quality


See User:Succu/SPARQL for lots of relevant examples.

Deprecation

See Help:Deprecation. One example of using this would be to correct dates of articles where CrossRef has got the date wrong (e.g., Wiley metadata).

Withdrawn identifiers

For example, if an ISSN has been cancelled, deprecate the statement and add the qualifier reason for deprecated rank (P2241) with the value withdrawn identifier value (Q21441764).

Deprecated identifiers

See Haplostoma humesi, New Species (Copepoda: Cyclopoida: Ascidicolidae), Associated with a Compound Ascidian (Aplidium Sp.) from Madagascar (Q104118218) for an example with two DOIs.

Duplicates

Trying to clean up Zootaxa, first example The type specimens of Tachinidae (Diptera) housed in the Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Buenos Aires (Q29469527) and The type specimens of Tachinidae (Diptera) housed in the Museo Argentino de Ciencias Naturales “Bernardino Rivadavia”, Buenos Aires (Q35800184)

Problem is that Revision of the neotropical Exoristini (Diptera, Tachinidae): the status of the genera Epiplagiops and Tetragrapha (Q35950285) cites both, but they are not merged in that references list of cited works.

I got bored and manually fixed the duplication.

Another test case: Revision of Zorion Pascoe (Coleoptera: Cerambycidae), an endemic genus of New Zealand (Q79429489) and Revision of Zorion Pascoe (Coleoptera: Cerambycidae), an endemic genus of New Zealand (Q28939048), where one has CrossRef DOI and is cited by A checklist of New Zealand Cerambycidae (Insecta: Coleoptera), excluding Lamiinae (Q56166058), the other has Zenodo DOI, has authors instead of author strings, and is linked to a taxon Zorion taranakiensis (Q14848194).


Zootaxa

Notes on de-duplicating Zootaxa. The counts here come from a local database I made, and will be out of date, especially as @Succu is working through duplicates manually. "x" means not entered yet.

Year | No. articles in WD at start | Duplicates | No. in CrossRef | Query | Notes
2001 | 17 | 0 | 20 | |
2002 | 105 | 0 | 107 | |
2003 | 258 | 0 | 269 | |
2004 | 261 | 0 | 388 | |
2005 | 424 | 1 | 583 | | Duplication is ZENODO DOI record Q79429489 of record Q28939048
2006 | 10 | 0 | 851 | | Q88363212 has only a Zenodo DOI
2007 | 167 | 0 | 1067 | |
2008 | 22 | 0 | 1111 | |
2009 | 19 | 0 | 1466 | |
2010 | 59 | 0 | 1416 | |
2011 | 47 | 0 | 1650 | |
2012 | 43 | 0 | 1893 | | Q29470959 has two PMIDs both valid but clearly duplicates
2013 | 2144 | 1237 | 2123 | | Q29468684 had duplicate PMIDs, one wrong. Q30860840 has a DOI but CrossRef metadata is wrong (title is that of preceding article). A number of 2013 articles have DOIs that don't resolve and hence DOI field in Wikidata is not populated.
2014 | 2017 | 3 | 2027 | |
2015 | 2338 | 0 | 2341 | |
2016 | 2315 | 5 | 2334 | | Q28821788 has two PMIDs both valid but clearly duplicates
2017 | 2241 | 9 | 1859 | | Note that Wikidata has more than CrossRef, check what happened here.
2018 | 2231 | 814 | 2322 | |
2019 | 2408 | 9 | 2505 | |
2020 | 1242 | 0 | 1106 | |

References

To add references for a statement in Wikidata using Quickstatements:

Q36504420 P21 Q6581072 S248 Q28948401

Note that the source property stated in (P248) is written with an "S" prefix (S248) rather than "P". To see the result in Wikidata see Andreja Kofol-Seliger (Q36504420).
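If there are many such statements, the commands can be generated rather than typed. The following is a minimal Python sketch, assuming the tab-separated QuickStatements version 1 syntax; the function name quickstatement is made up for illustration.

```python
def quickstatement(item, prop, value, sources=()):
    """Build one tab-separated QuickStatements (version 1) command.

    `sources` is a sequence of (property, value) pairs. Each source
    property is written with an "S" prefix in place of its leading "P".
    """
    fields = [item, prop, value]
    for source_prop, source_value in sources:
        if not source_prop.startswith("P"):
            raise ValueError("expected a property id such as P248")
        fields.append("S" + source_prop[1:])
        fields.append(source_value)
    return "\t".join(fields)


# The statement from the example above, referenced with stated in (P248).
print(quickstatement("Q36504420", "P21", "Q6581072", [("P248", "Q28948401")]))
```

The printed line can be pasted into QuickStatements as a single command.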

Examples


Homonyms

Homonyms can be linked to replacement names, and to each other.

Authority control

Authority control

Geography

Wikidata:List of properties/geography

GeoJSON can be stored on Wikimedia Commons, e.g. Commons:Data:BioStor/. The page has to be created manually (via a red link) before the data can be added.

The Commons page can then be linked from a Wikidata item using geoshape (P3896).

Need to figure out how to retrieve GeoJSON for use in applications.
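One possible approach, sketched in Python under two assumptions that I have not checked against every page: that a Commons Data: page can be fetched as JSON with action=raw, and that a .map page wraps its GeoJSON under a "data" key next to the licence and description. The page title in the usage note and the User-Agent string are placeholders.

```python
import json
import urllib.parse
import urllib.request


def commons_raw_url(title):
    """URL for the raw content of a Commons page such as 'Data:Example.map'."""
    query = urllib.parse.urlencode({"title": title, "action": "raw"})
    return "https://commons.wikimedia.org/w/index.php?" + query


def extract_geojson(map_page):
    """Return the GeoJSON object assumed to sit under the 'data' key."""
    return map_page["data"]


def fetch_geojson(title):
    """Download a Data: page from Commons and return its GeoJSON (needs network)."""
    request = urllib.request.Request(
        commons_raw_url(title),
        headers={"User-Agent": "geojson-sketch/0.1 (placeholder contact)"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_geojson(json.load(response))
```

Calling fetch_geojson("Data:Example.map") with a real page title in place of the placeholder should return a dictionary that mapping libraries can consume directly.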

See also Wikidata:Property_proposal/distribution_map_of_taxon

Taxonomic properties

See also Template:Taxonomy_properties, Wikidata:WikiProject Taxonomy, and Wikidata:WikiProject Taxonomy/Tutorial

problems

Wikidata conflates names and taxa; see wikipedia:User:Peter_coxhead/Wikidata_issues and Wikidata:Property_proposal/taxon_synonym_string. Is it possible to resolve this?

For a summary of properties see Template:Taxa Versus Names.

properties

Properties of a taxon (Q16521)     

taxon name (P225)   taxon rank (P105)   parent taxon (P171)   taxon common name (P1843)  

taxon synonym (P1420)   taxon range map image (P181)   hybrid of (P1531)  

taxon author (P405)  

Examples: Synalpheus pinkfloydi (Q29367343)

geography

type locality

type locality (biology) (P5304)  

Locality must be a Wikidata item. Qualifiers include object named as (P1932), point in time (P585), and coordinate location (P625) (e.g., Solanum aspersum (Q1305990)).

We can create a map of type localities: Try it!

nomenclature

nomenclature entities

protonym (Q14192851)      basionym (Q810198)     

taxon identifiers

ITIS TSN (P815)   Encyclopedia of Life ID (P830)   NCBI taxonomy ID (P685)   GBIF taxon ID (P846)   IPNI plant ID (P961)   Plant List ID (Royal Botanic Gardens, Kew) (P1070)   IUCN taxon ID (P627)   Tropicos ID (P960)   WCSPF ID (P3591)   Avibase ID (P2026)   MSW ID (P959)   BOLD Systems taxon ID (P3606)   MycoBank taxon name ID (P962)   Index Fungorum ID (P1391)  

page level link between name and BHL

Use BHL page ID (P687) as a reference for a taxon name (P225), e.g.

genome

sequenced genome URL (P6800) for link to genome, e.g. Asian tiger mosquito (Q477918) has

Taxonomic examples

User:Achim_Raschka/Erstbeschreibungen has a list of new species descriptions, see also query that generates a timeline: [1]

Nice examples for literature mapping

Interesting cases

Names that aren't taxa

There are cases where names have an entry but they are not instances of taxa, e.g. Satsuma chalybeia (Q25351070), which is described as a "name that may not be used" and is an instance of unavailable combination (Q17487588), synonym (Q1040689), and Satsuma Murray (1874) non Adams (1868) (Q25661771)(!). Looks like an attempt to separate names from taxa...

What this also means is that we can't rely on simply searching for things that are instances of taxon (Q16521) when looking to match names.

Traits

JSTOR

JSTOR content on Internet Archive

Major journal projects

Acta botanica Boreali-Occidentalia Sinica (Q27721266)

and CNKI. DOIs for some articles but they don't seem to be resolving. For example has DOI which doesn't resolve.

Acta botanica Boreali-Occidentalia Sinica
Years Volumes Source Pagination? Notes Status
1981-2021 CNKI Yes adding
1999-2009 Wanfang Yes DOIs 10.3321 ISTIC Added
2012-2013 Wanfang Yes DOIs 10.3969 ISTIC Added
2014-2016 Wanfang Yes DOIs 10.7606 ISTIC Not added yet

Acta Botanica Yunnanica (Q5683696)

Several name changes, multiple data sources, Chinese and English, lack of pagination data in some cases, mixed DOI agencies, not all DOIs resolve. PDFs available. Oh the fun we will have...

Can use Internet Archive to extract page numbers e.g., SOME NEW TAXA OF OLEACEAE FROM TIBET,CHINA (Q106563578) from

One challenge is the overlap between Wanfang volumes 21-26 and Wanfang has pages and DOIs, doesn't have either, but does have PDFs. So we need to somehow crosslink these :(

Acta Botanica Yunnanica
Years Volumes Source Pagination? Notes Status
1999-2010 21-32 Wanfang Data Yes DOIs, mostly 10.3969 but some 10.3724 Added using scraped data, then map to URLs, some manual editing of DOIs
2011-2015 33-37 Wanfang Data Yes PLANT DIVERSITY AND RESOURCES, DOIs 10.3724
1979-2020 1-42 Mostly (2005 onwards, before that, no) All called Plant Diversity, some DOIs 10.1016, 10.7677, 10.3724, every article has a URL Use to add Plant Diversity And Resources (2011-2015), and enhance Wanfang data 1999-2010.
1979-2015? Example CNKI No, nor does it have volumes Incomplete coverage, use to enhance by add CNKI ids where possible.
2016-2020 38-42 Elsevier Yes Plant Diversity, DOIs 10.1016 Add from CrossRef
DOI prefixes
Prefix | Who | Agency | Link
10.7677 | wanfangdata | ISTIC |
10.3969 | wanfangdata | ISTIC |
10.1016 | crossref | crossref |
10.3724 | Crossref - Science Press | crossref |

Acta Entomologica Sinica (Q21386079)

Lots of articles already, some with CNKI DOIs. Need to explore further, and add PDFs and other links.

It looks like the links I had in BioNames are now gone, so need to remap. Stevenliuyi did some amazing work adding articles with CNKI DOIs, but these lack volume numbers and pagination, so need to add those. I'd also like to archive and link to the PDFs. Rdmpage (talk) 11:50, 21 July 2021 (UTC)

Acta Mycologica (Q27724065)

Note that DOI dates are often incorrect: they look to be the dates the article went online, not when it was published!

Acta Palaeontologica Sinica (Q15746639)

CNKI has issued DOIs for the complete journal (as far as I can determine). The journal also has a home page that has PDFs and also includes DOIs. The HTML has citation_ tags, but they are poorly implemented: author footnotes are included, there is no DOI or pagination, etc.

OK, more complicated. CNKI has complete journal, but DOIs have been issued by different sources.

Acta Palaeontologica Sinica
Years | Volumes | Journal | DOIs | Notes
1953-1998 | 1 - | Acta Palaeontologica Sinica | CNKI | DOI 10.19800/j.cnki.aps...
1999-2009? | | Acta Palaeontologica Sinica | ISTIC Wanfang | DOI 10.3969/J.ISSN.0001-6616.2009.04.003
2010- | 49- | Acta Palaeontologica Sinica | CNKI | DOI 10.19800/j.cnki.aps...

Acta Phytotaxonomica et Geobotanica (Q100375972)

Name changes, ISSN changes, etc. Also in CiNii.

Acta phytotaxonomica et geobotanica
Years ISSN Source DOIs Notes
1932-2001 0001-6799,2189-7050 J-Stage 10.18942/bunruichiri... JaLC Acta phytotaxonomica et geobotanica / 植物分類, 地理 Acta Phytotaxonomica et Geobotanica (Q100375972)
2001- 1346-7565,2189-7042 J-Stage 10.18942/apg... JaLC Acta Phytotaxonomica et Geobotanica (APG) / 植物分類,地理 Acta Phytotaxonomica et Geobotanica (Q5656888)
1979-2020 2189-7034,1346-6852 J-Stage 分類 : bunrui : 日本植物分類学会誌 / Bunrui Bunrui (Q40186046)

Acta Phytotaxonomica Sinica (Q5656885)

Acta Phytotaxonomica Sinica (Q5656885) and Journal of Systematics and Evolution (Q15733644): multiple web sites, multiple DOIs, etc. See Acta Phytotaxonomica Sinica: A Bibliographic Summary of Published Volumes (Q28955370) for some earlier history. See for title change.

Acta Phytotaxonomica Sinica
Years Volumes Journal DOIs Notes
2009- 47- Journal of Systematics and Evolution CrossRef DOI 10.1111/ ISSN 1674-4918
2008-2009 46 Journal of Systematics and Evolution ? DOI 10.3724/SP.J... ISSN
2005-2007 43-45 Acta Phytotaxonomica Sinica CrossRef DOI 10.1360 ISSN 0529-1526
1951-2004 1-42 Acta Phytotaxonomica Sinica

Acta Zoologica Sinica (Q105241277)

ISSN 0001-7302, needs lots of work, became Current Zoology (Q15749150). Local database (publications) has CNKI URLs such as which now break, can be rewritten as, which also has page information.

Also links (but no DOIs) in Wanfang Also links in CQVIP e.g.

Acta Zootaxonomica Sinica (Q15761826)

Name changes, ISSN changes, etc. Acta Zootaxonomica Sinica (Q15761826) and Zoological Systematics (Q21386166). Lack of pagination data. Lots of articles already added by @Stevenliuyi: with CJFD journal article ID (P6769) identifier, but these include old articles in Acta Zootaxonomica Sinica (Q15761826) linked to new name for journal Zoological Systematics (Q21386166).

I've started to move the pre-2014 articles to Acta Zootaxonomica Sinica (Q15761826) and add the Wanfang DOIs. Pagination will have to be added later.

2020-11-14 Danger Will Robinson! CNKI and Wanfang article numbering is different, so can't rely on simply mapping Wanfang URL to CNKI even though they look very similar :(. Will need to be cleverer about the mapping...

Acta Zootaxonomica Sinica
Years ISSN Source DOIs Notes
1964- 1000-0739 CNKI Acta Zootaxonomica Sinica (Q15761826)
1998-2013 1000-0739 WanFang 10.3969/j.issn.1000-0739... Not all have DOIs
2014- 2095-6827 10.11865/zs. DOIs don't seem to resolve?

American Museum Novitates (Q4744472)

This journal has suffered from having BHL treat issues as titles, so that Wikidata has separate items for articles that have poor metadata and aren't linked to the journal. Seems to be restricted to pre-1923 content, will need to check. Given that BHL continues to do this, will need to spend some time linking to newer BHL and IA content...

  • I've linked most(all?) stray pre-1923 articles to Handles and the journal itself. 2020-12-18

Annales de Parasitologie Humaine et Comparée (Q21384994)

Many articles with PMID but no DOI (and also badly translated English titles). See for query for PMID but no DOI. I have matched PMIDs to records locally, need to move this to Wikidata.

Annales Zoologici (Q15761023)

Article A revision of the genus Heliophanus C. L. Koch, 1833 (Aranei: Salticidae) (Q60864671) has a bad Internet Archive ID (P724) as it contains []. Need to fix.

The Bryologist (Q7720447)

We have lots of BioOne DOIs for articles before 2000, which is when BioOne first has content for this journal. Prior to 2000 all DOIs that resolve to content are JSTOR DOIs. It looks like BioOne fed CrossRef a bunch of DOIs for which it never had content, hence CrossRef metadata has lots of duplicate DOIs. Some of these are in Wikidata... what a mess.

I've added JSTOR DOIs to those pre-2000 articles that only had "fake" BioOne DOIs, and added all JSTOR-DOI content prior to 2000 Rdmpage (talk) 13:46, 4 April 2022 (UTC)

Bulletin de la Société entomologique de France (Q21385878)

Note that we also have some BHL DOIs for this journal, and also there are a small number of articles in Horizon.

There is overlap between Persee and CrossRef from 2018 points to, landing page is volume not individual article.

So we have Persee and overlap 2008-2016, 2017 is just, then 2018 - CrossRef

Bulletin de la Société Entomologique de France
Years Source DOIs Notes
1896-2016 Persee Some articles from 1896 have DOIs 10.3406/bsef. Post 2016 are embargoed so no metadata, all PDFs have a captcha in front of them.
2008- PDFs freely available (most recent under embargo)
2018- CrossRef 10.32475/bsef.

Bulletin of the Osaka Museum of Natural History (Q21385090)

Some have DOIs e.g. Five new species of the genus Trichotichnus from Taiwan (Coleoptera, Carabidae, Harpalini) (Q111523062) 10.20643/00001606 which leads to a repository

Bulletin of Botanical Research (Q55758940) and Bulletin of Botanical Research (Q5735529)

Need to sort out ISSNs and history

Also bad things have happened involving mismatch between metadata (titles and pages) and Internet Archive PDFs. Need to review all articles and sort this out :(

For example Cloning and Gene Expression of 3-Hydroxy-3-Methylglutaryl-CoA Synthase Gene( AsHMGS ) from Aquilaria sinensis (Lour.) Gilg (Q96107615) title doesn't match first line or pagination, and IA PDF is for different article.

Note that there are duplicate DOIs, e.g. 10.7525/j.issn.1673-5102.2007.05.002 appears in the metadata for two articles!

Wrong or inconsistent metadata in Wikidata

  • Bad
  • Fixed/OK

Q96107547 Q96107548 Q96107551 Q96107552 Q96107555 Q96107556 Q96107557 Q96107559 Q96107568 Q96107570 Q96107571 Q96107573 Q96107574 Q96107576 Q96107578 Q96107580 Q96107581 Q96107582 Q96107590 Q96107592 Q96107593 Q96107594 Q96107597 Q96107599 Q96107601 Q96107606 Q96107607 Q96107611 Q96107613 Q96107615 Q96107626 Q96107710 Q96107886 Q96107896 Q96107978 Q96108014 Q96108078 Q96108079 Q96108091 Q96108099 Q96108284 Q96108340 Q96108438 Q96108512 Q96108519 Q96108655 Q96108753 Q96108770 Q96108786 Q96108839 Q96108857 Q96108909 Q96108977 Q96109101

Wrong PDF in IA

For some articles the contents of the PDF in IA are wrong, e.g. Q96107529 Q96107530 Q96107532 Q96107535 Q96107536 Q96107528

I've made all these items in IA dark.

Wrong IA metadata

I've deleted DOIs when I've found a mismatch between metadata and DOI.

Bulletin of the United States National Museum

Bit of a mess with multiple items of different type, see discussion

Based on BHL, ISSN and SuDoc

Name ISSN BHL Sudoc Years
Bulletin - or / United States National Museum 0362-9236 169510 037366866 1907-1971
Bulletin of the United States National Museum 0096-2961 169509 036653233 1875-1905

Note also on BHL "No. 1-16 issued also in: Smithsonian miscellaneous collections, v. 13, 23-24."

Bulletin of the United States National Museum
Name Description Years ISSN BHL Notes
Bulletin of the United States National Museum (Q21385133) version of volume 1 of a journal 1877-1971 0362-9236 BHL:169509 (1875-1905)

118 publications linked to this item 1889-1969

Bulletin of the United States National Museum (Q21385329) scientific journal (1875–1905) 1875–1905 0096-2961

Nine publications linked to this item 1886-1970

Bulletin of the United States National Museum (Q56634273) volume 1 of a publication

has edition or translation (P747) Bulletin of the United States National Museum (Q21385133) , no external identifiers

Bulletin of the United States National Museum (Q56633969) publicaton of the United States Museum

has part(s) (P527) Bulletin of the United States National Museum (Q56634273) no external identifiers but linked to which spans 1879-1971

Entomologicheskoe Obozrenie (Q4532102) and Entomological Review (Q47161189)

Looks like (Q4037789) may have made some mistakes linking translations together, may have to check against Springer website for Entomological Review (Q47161189). Rdmpage (talk) 17:11, 29 January 2023 (UTC)

Japanese Journal of Ichthyology (Q21385442)

Some confusion between Japanese Journal of Ichthyology (Q21385442) and Ichthyological Research (Q15760079) (partly caused by Springer). These are two separate journals. Metadata for Japanese Journal of Ichthyology (Q21385442) retrieved via DOIs often lacks titles, will need to add manually. Also need to add titles in multiple languages via harvesting web site.

Journal of China Agricultural University (Q27438849)

Online but will require some effort to get list of articles

Journal of Japanese Botany (Q5946705)

Three articles so far, note that Japanese Species of Parmelia Ach. (sens. str.), Parmeliaceae (Q59123248) is a composite of several articles with the same title (i.e., it is a series of papers).

Journal of The Asiatic Society of Bengal (and Materials for a Flora of the Malayan Peninsula)

This journal contains Materials for a Flora of the Malayan Peninsula, which has also been issued as a separate reprint.

There is a scanned archive at South Asia Archive (Q104412505) that is behind a paywall, there are also freely accessible PDFs at

Journal of The Asiatic Society of Bengal
Name Years ISSN BHL Notes
Journal of the Asiatic Society of Bengal. Part 2. Natural History (Q2840714) 1871(?)-1936 Treated as a separate publication by Wikidata and IPNI, Wikidata says it "replaced" Journal of the Asiatic Society of Bengal (Q16584125) whereas it was a separate part ("2") that continued on after the journal Journal of the Asiatic Society of Bengal (Q16584125) into journal and proceedings.
Journal of the Asiatic Society of Bengal (Q16584125) 1832-1905 (1936 for part 2) 0368-1068
Proceedings of the Asiatic Society of Bengal (Q41298163) 1865-1904 0369-8416 Proceedings of the Asiatic Society of Bengal
Journal and proceedings of the Asiatic Society of Bengal (Q51496795) 1905- 0368-3451 new ser., v.1 onwards, note that the Proceedings are at the end of the volume, and some of the archive indexes the proceedings (e.g., ). southasiaarchive has new volume numbering from 1906
Journal of the Asiatic Society 1935 to 1950's(?) Volume change in southasiaarchive for 1935
Journal of the Asiatic Society (Q27716010) Vol. 1, no. 1 (1959)- 0368-3303
Materials for a Flora of the Malayan Peninsula Reprint, see and for details and advice on how to cite. There are various versions in BHL

Items about the journal include Systematic notes on Asian birds. 51. Dates of avian names introduced in early volumes of the Journal of the Asiatic Society of Bengal (Q89633372)

Items about specific articles include XIV.—Notes on the œconomy of the Paussidæ, extracted from Capt. W. J. E. Boyes' Paper, published in the Journal of the Asiatic Society of Bengal (No. 138.—N. S. No. 54) (Q99848588)

Items about Materials for a Flora of the Malayan Peninsula: Materials for a Flora of the Malayan Peninsula (Q13554331) (generic item for work) and Materials for a flora of the Malayan Peninsula (Q51383079) (BHL copy). Articles mentioning it: Materials for a Flora of the Malayan Peninsula (Q64285277), Materials for a Flora of the Malayan Peninsula (Q64276618) (in Nature)

Mycologia (Q1962302)

Multiple DOIs, duplication, redirects, etc. Plus we have items from PubMed that don't have DOIs. In short, a clusterfuck.

Suggested work flow:

Map DOIs 10.3852 to Wikidata, use php doi_to_doi.php to get redirect DOIs, map those to Wikidata, then see if we need to add extra DOIs to those items (i.e., 10.3852 and 10.1080)
Map PMIDs to Wikidata for references that lack DOIs in Wikidata then add those DOIs to Wikidata
Map JSTOR to Wikidata, add JSTOR ids if missing from Wikidata
Merge records with different DOIs and update corresponding Wikidata items
Add any missing records to Wikidata
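The sorting implied by this workflow can be sketched in Python. This is only an illustration, assuming the redirects have already been harvested into a dictionary (old DOI to new DOI) and that a second dictionary maps the DOIs already in Wikidata to their items; the function name and the result keys are made up.

```python
def classify_doi_pairs(redirects, doi_to_item):
    """Group (old DOI, new DOI) pairs by the work they still need.

    redirects   -- dict: old DOI -> DOI it redirects to
    doi_to_item -- dict: DOI -> Wikidata item id for DOIs already in Wikidata
    """
    # DOIs are case-insensitive, so compare them in upper case.
    known = {doi.upper(): item for doi, item in doi_to_item.items()}
    result = {"merge": [], "add_doi": [], "add_record": []}
    for old, new in redirects.items():
        old_item = known.get(old.upper())
        new_item = known.get(new.upper())
        if old_item and new_item:
            if old_item != new_item:
                # Each DOI has its own item: a duplicate pair to merge.
                result["merge"].append((old_item, new_item))
        elif old_item:
            # Only the old DOI is in Wikidata: add the new DOI to that item.
            result["add_doi"].append((old_item, new))
        elif new_item:
            # Only the new DOI is in Wikidata: add the old DOI to that item.
            result["add_doi"].append((new_item, old))
        else:
            # Neither DOI is in Wikidata: a record that still has to be added.
            result["add_record"].append((old, new))
    return result
```

Pairs that already share a single item need no work and are left out of all three lists.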

Issues

Using redirection to find the "other DOI" for 10.3852 uncovered 25 items that are duplicates, i.e., both DOIs have a Wikidata item:

Q33292380 Q54802190
Q31120184 Q105485138
Q31042045 Q63854883
Q39437650 Q60401823
Q34491252 Q110615535
Q58803921 Q81300049
Q28295469 Q56485068
Q33309119 Q110827263
Q31139254 Q57254285
Q56915900 Q31111921
Q60449006 Q51709251
Q28301216 Q59678579
Q31130104 Q56931476
Q22255400 Q34626370
Q56145227 Q31111919
Q28276701 Q60895113
Q58045983 Q39098297
Q64385287 Q31081978
Q51119147 Q57448972
Q111373678 Q51188492
Q111262623 Q80758057
Q31120961 Q59205306
Q58656219 Q82464853
Q56951379 Q34399871
Years | Source | DOIs | Notes
1909-2004 | JSTOR | 10.2307 | Resolve to JSTOR
2005-2016 | Now on T&F | 10.3852 | DOIs like 10.3852/mycologia seem to be redirects to 10.1080
1909-2022 | T&F | 10.1080 | Complete coverage, so many duplicates of JSTOR content
1945-2020 | PubMed | Some missing DOIs | Sporadic, many lacking DOIs

These duplicates have now been merged.

Then looked at items with PMID but no DOI in time period 2005-2016, these seem to be articles with no DOI at all.

Next issue is to identify those records that have only one DOI in Wikidata. These correspond to item with 10.3852 DOIs, a Wikidata item, and no Wikidata item for the 10.1080 DOI. I have added the missing DOIs to these records (they now have 2 DOIs).

This leaves us with DOIs for this time period that have no Wikidata item at all. These will be added.

DOIs from 2017 onwards are just T&F so these can be added directly.

Records for 1909-2014 will have to be merged where they have two DOIs, and we also need to check for PubMed-only records, of which there are a number. PMIDs will link both DOIs, so we will need both DOIs added, then add the missing DOI for records with one DOI, then add missing records (and extra DOIs).

Details

Progress: volume 96

-- Look at a volume
SELECT guid, title, volume, spage, epage, doi, wikidata, pmid, pii FROM publications_tmp WHERE issn='0027-5514' AND volume=96 ORDER BY CAST(spage AS SIGNED);

-- Add DOIs to records with PMID but no DOI
SELECT CONCAT(pii, CHAR(9), 'P356', CHAR(9), '"', doi, '"') FROM publications_tmp WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pmid IS NOT NULL ORDER BY pii;

-- Get wikidata for DOIs with no PMID
SELECT guid FROM publications_tmp WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pii IS NULL;

-- Get any articles with newer DOIs that we should add
SELECT * FROM publications_tmp WHERE issn='0027-5514' AND volume=96 AND wikidata IS NULL AND pii IS NULL AND doi LIKE '10.1080%';

PubMed errors

There is a problem with volumes 60 and 61 in PubMed, some articles are assigned to the wrong volume :( Fixed Rdmpage (talk) 08:46, 15 April 2022 (UTC)

Mycosystema (Q15760108) and Mycosystema (Q52380050) and Acta Mycologica Sinica (Q52380146)

CNKI, see also

History of the journal in =

Years Journal ISSN Volumes Source Pagination? Notes Status
1982-1996 Acta Mycologica Sinica 0256-1883 1-15 magtech and CNKI Yes DOIs 10.13346 Not added yet
1997-2003 Mycosystema 1007-3515 16-22 magtech no no DOIs(?) Not added yet
2004-2013 Mycosystema 1672-6472 23- magtech and CNKI Yes DOIs 10.13346 Not added yet

Muséum National d'Histoire Naturelle journals

See 1802–2018: 220 ans d'histoire des périodiques au Muséum (Q93462644) and Timeline of the scientific publications of The Museum for details.

Records of the Auckland Museum

Special:Contributions/Prosperosity has made some edits to this journal and merged the older records I added (JSTOR) to newer ones they created. I will need to update my local mapping between JSTOR ids and Wikidata to accommodate this.

  • Done

Russian Journal of Genetics / Genetika

Wikidata has these journals confused. For example, Gene diversity for haptoglobin and transferrin classical markers among Hindu and Muslim populations of Aligarh City, India. (Q48605442) is stated as being published in Russian Journal of Genetics (Q15753063) (the English language journal) with a link to PubMed, but PubMed says it is published in Genetika, which is the Russian journal (which lacks a Wikidata item). Russian Journal of Genetics (Q15753063) has three ISSNs: 1022-7954, 1608-3369, and 0016-6758; the last one (0016-6758) is for Genetika (Moskva). Need to unpack this journal and link articles to the correct journal. Note that the English language articles will (mostly? all?) be translations of the articles in Genetika.

For the case of the article "Gene diversity for haptoglobin and transferri..." the Wikidata record is from GENETIKA but Wikidata links this to Russian Journal of Genetics. A bit of a mess :(.

Gene diversity for haptoglobin and transferrin classical markers among Hindu and Muslim populations of Aligarh City, India (Q48605442).
Journal | DOI | PMID | URL | pages
Russian Journal of Genetics | 10.1134/s1022795411060044 | | | 47(6): 744-748
GENETIKA ГЕНЕТИКА | - | 21866866 | | 47(6): 842-846

Sichuan Journal of Zoology (Q21385429)

Some articles with DOIs, PDFs available, website is a bit sluggish. Has English and Chinese metadata.

Taxon (Q2003024)

There is a big gap in DOI coverage where we have JSTOR ids but no DOIs. These DOIs exist for Wiley content, so need to match DOIs to existing Wikidata records (from JSTOR).

Volumes to do are 57-61, see

Lepidoptera Science (Q21385526)

Journal has several names, not sure of the timing of each, e.g. Transactions of the Lepidopterological Society of Japan, Tyô to Ga, Lepidoptera Science.

My first import generated a number of duplicates as I didn't check that DOIs were unique before adding them (doh!). This resulted in 1348 duplicates which I am merging.
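A uniqueness check of this kind is cheap to run before an import. A minimal Python sketch, assuming the records are available as (record id, DOI) pairs; the function name and the sample data in the usage line are hypothetical.

```python
from collections import defaultdict


def find_duplicate_dois(records):
    """Return {DOI: [record ids]} for every DOI that occurs more than once.

    records -- iterable of (record id, DOI) pairs
    """
    seen = defaultdict(list)
    for record_id, doi in records:
        # DOIs are case-insensitive, so normalise before comparing.
        seen[doi.strip().upper()].append(record_id)
    return {doi: ids for doi, ids in seen.items() if len(ids) > 1}


# Hypothetical sample: records "a" and "c" share a DOI that differs only in case.
sample = [("a", "10.1234/abc"), ("b", "10.1234/xyz"), ("c", "10.1234/ABC")]
print(find_duplicate_dois(sample))
```

The same function can be pointed at DOIs already in Wikidata plus the ones about to be added, so that clashes show up before any items are created.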

I also want to link these articles to their CiNii identifiers.

Vestnik Zoologii (Q21385272)

Volume 42 (2008) onwards has CrossRef DOIs, although there are issues with their resolution. Prior to 2008 lots of articles online, not all with volumes, etc.

Zoological Research (Q15766889)

Many articles already in Wikidata, mix of DOIs (not all work), also some coverage in Wanfang and CNKI. Wikidata coverage is based primarily on PubMed. Note that there are at least three different DOI agencies, and some overlap in DOIs and/or agencies

Two ISSNs (and two Wikidata items) Zoological Research (Q15766889) 0254-5853 and Zoological Research (Q27714095) 2095-8137

According to NLM "Began with Volume 37, issue 5 (18 September 2016).", but this is not entirely clear from the journal itself. For example, the cover for "Volume 35 Issue 5 18 September 2014" (3516) says ISSN 2095-8137 whereas "Volume 35 Issue 3 18 May 2014" has ISSN 0254-5853. 2014 also seems to be the year that the DOIs have mixed ISSNs. What a mess. The cover of vol 35 issue 4 (18 July 2014) has ISSN 2095-8137 and also DOIs with that ISSN, so I think that is the issue when the ISSN changed.

Note that CrossRef and Wanfang DOIs overlap in volume 29! Also looks like not all Crossref DOIs work :(

Added most of 0254-5853 vols 1-34,35 still need to add Chinese titles to post vol 29 as many are Pubmed English translations. Will need to import Chinese titles and store in multilingual as current code uses DOI as GUID and hence misses the Chinese titles. Rdmpage (talk) 13:47, 2 May 2022 (UTC)

Zoological Research
Years | Volumes | DOIs | ISSN | Notes
2017- | 38- | CrossRef | 2095-8137 | DOI 10.24272/...
2014-2017 | 35-38 | CNKI | 2095-8137 | DOI 10.13918/j.issn.2095-8137
2013-2014 | 34-35 | ? | 0254-5853 | DOI 10.11813/j.issn.0254-5853 (broken)
2004-2021 | 25-42 | ? | ? | Bioline
2008-2013 | 29-34 | CrossRef | 0254-5853 | DOI 10.3724/...
1999-2008 | 20-29 | ISTIC | 0254-5853 | DOI 10.3321/j.issn: Wanfang
1980-1998 | 1-19 | | 0254-5853 | No DOIs

Multilingual titles

If you have a Chinese title (e.g., "西北植物学报") and a transliteration (e.g., "Xibei zhiwu xuebao") then you can connect the two using pinyin transliteration (P1721). See Acta botanica Boreali-Occidentalia Sinica (Q27721266).

Quote from Contributions to the botanical journal Sunyatsenia from 1930 to 1948 (Q28944969)

"Names of Chinese botanists follow the convention of placing the family name first followed by given names; names of Westerners follow the western convention of placing the family name last. Chinese botanists mostly followed the Wade-Giles system of Romanization when transliterating their name; and the current pinyin system was initiated late in the careers of most of the early Chinese botanists around 1950s. They were not required to adopt the pinyin system if they had actively published and were known under a different transliteration of their name."

Wade-Giles is Wade-Giles (Q208442). This means we may need to take some care in handling Chinese names for older literature.

Titles with HTML markup

title (P1476) shouldn't have any markup, but you can add the qualifier title in HTML (P6833) to the title to include the markup. For example, see Sur le genre Trypanoxyuris (Oxyuridae, Nematoda) IV. Sous-genre Trypanoxyuris parasite de Primates Cebidae et Atelidae (suite) Étude morphologique de Trypanoxyuris callicebi n. sp. (Q64173850).
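When the source metadata only has the marked-up title, the plain value for title (P1476) can be derived from it and the original kept for title in HTML (P6833). A minimal Python sketch using only the standard library; the function name plain_title and the sample title are made up for illustration.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect the text content of a fragment and ignore the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_title(html_title):
    """Strip markup from a title and collapse runs of whitespace."""
    parser = _TextOnly()
    parser.feed(html_title)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Hypothetical title with an italicised genus name and an HTML entity.
print(plain_title("A new species of <i>Anolis</i> &amp; notes"))
```

The input string would go into the P6833 qualifier unchanged, and the returned string into P1476.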

Full text

Note document file on Wikimedia Commons (P996) e.g. for A new cryptic species of Anolis lizard from northwestern South America (Iguanidae, Dactyloinae) (Q58700998) which essentially embeds a PDF in Wikidata!

Checksums

Maybe add checksum (P4092) as a property to a publication, as a way to link (indirectly) to content, see also and by Ben Trask (Q63232898).
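Computing such a checksum for a PDF is straightforward. A minimal Python sketch, assuming SHA-1 as the digest; which algorithm to record on the Wikidata statement is an open choice here, and the function names are made up.

```python
import hashlib


def bytes_checksum(data, algorithm="sha1"):
    """Hex digest of a byte string using the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def file_checksum(path, algorithm="sha1", chunk_size=65536):
    """Hex digest of a file, read in chunks so that large PDFs fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


print(bytes_checksum(b"abc"))
```

Two copies of the same PDF held in different archives would give the same digest, which is what makes the checksum usable as an indirect link to content.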

Author matching and related issues

Matches without series ordinal

Note that User:EvaSeidlmayer has added author (P50) to lots of references without adding series ordinal (P1545), and leaving author name string (P2093) in place, so we have two entries for the same author, one as a thing (an item) and one as a string (see e.g., North American distribution ofEleocharis mamillata(Cyperaceae) and confusion withE. macrostachyaandE. palustris (Q100395512)).
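One way to spot these cases, sketched under the assumption that the author statements of a work have already been fetched: compare each author name string (P2093) with the labels of the author (P50) items, and report the strings that duplicate an item. The matching rule, the function names and the sample data (including the placeholder id Q_AUTHOR) are all hypothetical.

```python
def normalise(name):
    """Crude name normalisation: lower-case, drop full stops, squeeze spaces."""
    return " ".join(name.lower().replace(".", " ").split())


def redundant_name_strings(authors, name_strings):
    """
    authors: list of (qid, label, ordinal_or_None) from P50 statements.
    name_strings: list of (name, ordinal) from P2093 statements.
    Returns (name, ordinal, qid) for every string that matches an author item,
    so the ordinal can be copied to the P50 statement before removing the string.
    """
    by_label = {normalise(label): qid for qid, label, _ in authors}
    return [
        (name, ordinal, by_label[normalise(name)])
        for name, ordinal in name_strings
        if normalise(name) in by_label
    ]


# Made-up example data
authors = [("Q_AUTHOR", "Jane Q. Doe", None)]
name_strings = [("Jane Q Doe", "1"), ("John Roe", "2")]

matches = redundant_name_strings(authors, name_strings)
print(matches)
```

Exact label matching will miss initials and variant spellings, so this only catches the easy duplicates.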

Redirects

Wikispecies

Sometimes we have authors (or other entities) that have two Wikidata items (e.g., two links to Wikispecies) when there is really only one entity (e.g., one person). An example is Eduardo Flórez Daza (Q21392863) and Eduardo Flórez Daza (Q56650857). These are the same person, and the Wikispecies entry for Eduardo Flórez Daza is a redirect to Álvaro Eduardo Flórez-Daza. The convention for this seems to be:

In this case Eduardo Flórez Daza (Q21392863) is now mostly empty and data on this person can be found at Eduardo Flórez Daza (Q56650857).

Authors

When merging authors, e.g. John William Thieret (Q102229589) with John William Thieret (Q21390395), the expectation is that a bot will update every link to Q102229589 to point to Q21390395. This process seems to take a long time. Telegram chat suggests 8 days: Q102229589 was made a redirect on 2023-07-19, and the links were updated on 2023-07-27 by User:KrBot (eight days later).

Bibliographic relationships

Reviews

Could use for reviews of books, etc.

Translations

Two New Species of the Weevil Genus Mecysmoderes Schoenherr, 1837 (Coleoptera, Curculionidae: Ceutorhynchinae) from Vietnam (Q99837830) in Entomological Review (Q47161189) is the English language version of ДВА НОВЫХ ВИДА ДОЛГОНОСИКОВ РОДА MECYSMODERES SCHOENHERR, 1837 (COLEOPTERA, CURCULIONIDAE: CEUTORHYNCHINAE) ИЗ ВЬЕТНАМА (Q99838137) in Entomologicheskoe Obozrenie (Q4532102). How do we represent this relationship?

OK, we can use edition or translation of (P629) and its inverse has edition or translation (P747) to link the two works together. Maybe we should also make the translated article an instance of version, edition or translation (Q3331189).
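To make the direction of the two properties concrete, here is a small sketch that generates the reciprocal statements. It assumes the Russian article is treated as the work and the English article as its translation; edition_links is an illustrative helper, not an existing tool.

```python
def edition_links(work, edition):
    """Return the two reciprocal statements as (subject, property, object)."""
    return [
        (edition, "P629", work),  # edition or translation of
        (work, "P747", edition),  # has edition or translation
    ]


# Assumption: Russian original is the work, English version is the translation
RUSSIAN = "Q99838137"
ENGLISH = "Q99837830"

statements = edition_links(work=RUSSIAN, edition=ENGLISH)
for statement in statements:
    # Tab-separated, so the lines could be pasted into QuickStatements
    print("\t".join(statement))
```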

Errata

The article The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99931585) has an erratum Erratum to: The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99888822). To connect an article to its errata we use corrigendum / erratum (P2507) as a property of the original article, hence we have Q99931585 -- P2507 --> Q99888822

Note that there are bots that automatically add instance of (P31) erratum (Q1348305) to errata (see history of Erratum to: The first African record of Artolenzites acuta comb. nov. (Basidiomycota, Polyporaceae) (Q99888822)).

Note also that User:Trilotat has some useful queries to find corrections and the things they correct,_errata_and_corrigenda

PDFs

To add a PDF for an article use full work available at URL (P953), add file format (P2701) Portable Document Format (Q42332) as a qualifier to say that it is a PDF, and add archive URL (P1065) with a link to the URL in the Wayback Machine if it has been archived there. See Trithecoides, a new subgenus of Culicoides (Diptera: Ceratopogonidae) (Q89666437) for an example.
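A sketch of the statement described above as a plain data structure. The Wayback Machine address follows the usual web.archive.org/web/<timestamp>/<url> pattern; the PDF URL and capture time in the example are invented, and pdf_statement is a hypothetical helper.

```python
from datetime import datetime


def wayback_url(url, captured):
    """Wayback Machine address for a capture made at the given time."""
    return f"https://web.archive.org/web/{captured:%Y%m%d%H%M%S}/{url}"


def pdf_statement(item, pdf_url, captured=None):
    """P953 statement with file format and (optionally) archive URL qualifiers."""
    qualifiers = [("P2701", "Q42332")]  # file format: PDF
    if captured is not None:
        qualifiers.append(("P1065", wayback_url(pdf_url, captured)))
    return {
        "item": item,
        "property": "P953",  # full work available at URL
        "value": pdf_url,
        "qualifiers": qualifiers,
    }


# Invented URL and capture time, for illustration only
stmt = pdf_statement(
    "Q89666437",
    "https://example.org/paper.pdf",
    captured=datetime(2020, 1, 2, 3, 4, 5),
)
print(stmt)
```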

Books

One model is that the "book" is a written work (Q47461344), which has basic information (title, author) and an identifier such as OCLC work ID (P5331). Then we have editions, i.e. version, edition or translation (Q3331189), that have ISBNs (e.g., ISBN-10 (P957)), Google Books ID (P675), etc. Works are linked to editions by has edition or translation (P747), and editions are linked to works by edition or translation of (P629). Wikidata:WikiProject Books wants every book to have both a written work (Q47461344) and at least one version, edition or translation (Q3331189), which seems redundant in many cases. For now I use Google Books to add books and by default make them written work (Q47461344). I follow Wikidata:WikiProject Books if there are multiple editions that seem important (e.g., they are cited).

Wikisource

See for example The Afghan War (Q19077572).

A version, edition or translation (Q3331189) has document file on Wikimedia Commons (P996), linking to a file on Commons, and Wikisource index page URL (P1957) which is the link to the Wikisource page for the transcription of the book.

Chapters

A chapter (Q1980247) is part of (P361) a book, and the book should list each chapter as has part(s) (P527), see for example The Canterbury Tales (Q191663)

Citations

In Ridleyandra merohmerea (Gesneriaceae), a new species from Kelantan, Peninsular Malaysia (Q42258926) I explored adding citations without DOIs as strings using unknown (Q24238356). See also proposal by GerardM for a citation string Wikidata:Property proposal/cites work string.

On the basis of this (unsuccessful) proposal GerardM has been exploring adding citations to cites work (P2860) using placeholder for "somevalue" (Q53569537), see for example Can trophic rewilding reduce the impact of fire in a more flammable world? (Q57805204).

Given that QuickStatements struggles with placeholder for "somevalue" (Q53569537) we will need to look at using the API to edit these statements directly (using unique statement ids). Need to be able to:

  • add a citation that lacks an item (with qualifiers)
  • retrieve details of citation that lacks item so we can try and add or match it
  • update citation that lacks an item with corresponding item

API experiments Q102901875 and Q102902439.
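The three operations could map onto the Wikibase API modules wbcreateclaim, wbsetqualifier, wbgetclaims and wbsetclaimvalue. The sketch below only builds the request parameters; it does not log in or send anything, and a real edit would also need a CSRF token. The statement id is a dummy value and Q4115189 (the Wikidata Sandbox item) stands in for a matched work.

```python
import json


def create_placeholder_citation(work_qid):
    """Add a 'cites work' (P2860) statement whose value is unknown."""
    return {"action": "wbcreateclaim", "entity": work_qid,
            "property": "P2860", "snaktype": "somevalue", "format": "json"}


def add_string_qualifier(statement_id, prop, text):
    """Attach a string qualifier (e.g. P1932) to an existing statement."""
    return {"action": "wbsetqualifier", "claim": statement_id,
            "property": prop, "snaktype": "value",
            "value": json.dumps(text), "format": "json"}


def get_citations(work_qid):
    """Fetch all P2860 statements, including those lacking an item."""
    return {"action": "wbgetclaims", "entity": work_qid,
            "property": "P2860", "format": "json"}


def resolve_citation(statement_id, cited_qid):
    """Replace the unknown value with the matched item."""
    value = {"entity-type": "item", "numeric-id": int(cited_qid[1:])}
    return {"action": "wbsetclaimvalue", "claim": statement_id,
            "snaktype": "value", "value": json.dumps(value), "format": "json"}


# Dummy statement id, for illustration only
SID = "Q42258926$00000000-0000-0000-0000-000000000000"

create = create_placeholder_citation("Q42258926")
qualify = add_string_qualifier(SID, "P1932", "Unstructured citation text")
fetch = get_citations("Q42258926")
resolve = resolve_citation(SID, "Q4115189")
```

Working with statement ids means a citation can be updated in place, keeping its qualifiers, rather than being deleted and re-added.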

Unstructured citations

Querying for unstructured citations:

select * where {
  # Ridleyandra merohmerea ...
  VALUES ?work { wd:Q42258926 } .
  # Outsized effect of predation...
  # VALUES ?work { wd:Q102058694 } .

  # Get cited works
  ?work p:P2860 ?statement .
  ?statement ps:P2860 ?cites .

  # stated as
  OPTIONAL { ?statement pq:P1932 ?unstructured . }

  # series ordinal
  OPTIONAL { ?statement pq:P1545 ?position . }

  # title
  OPTIONAL { ?statement pq:P1476 ?title . }

  # author name string
  OPTIONAL { ?statement pq:P2093 ?authors . }

  # publication date
  OPTIONAL { ?statement pq:P577 ?date . }

  # DOI
  OPTIONAL { ?statement pq:P356 ?doi . }

  # URL
  OPTIONAL { ?statement pq:P953 ?url . }

  # keep only rows where ?cites is not an IRI
  FILTER (!isIRI(?cites))
}
ORDER BY (xsd:integer(?position))
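The results of a query like this come back in the standard SPARQL JSON format (results.bindings, one dictionary per row). A small sketch of flattening that into rows sorted by series ordinal follows; the sample response is made up and citation_rows is a hypothetical helper.

```python
def citation_rows(results):
    """Flatten SPARQL JSON bindings into dicts, ordered by ?position."""
    rows = [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]

    def sort_key(row):
        position = row.get("position", "")
        # Rows without a numeric ordinal go last
        return int(position) if position.isdigit() else float("inf")

    rows.sort(key=sort_key)
    return rows


# Made-up response, shaped like SPARQL JSON results
sample = {
    "results": {
        "bindings": [
            {"position": {"type": "literal", "value": "2"},
             "unstructured": {"type": "literal", "value": "Second reference"}},
            {"unstructured": {"type": "literal", "value": "No ordinal"}},
            {"position": {"type": "literal", "value": "1"},
             "unstructured": {"type": "literal", "value": "First reference"}},
        ]
    }
}

rows = citation_rows(sample)
for row in rows:
    print(row.get("position", "-"), row["unstructured"])
```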

Bibliographic identifiers (and proposals)

(see also Template:Bibliographic_properties)

Handle ID (P1184)  

Zenodo ID (P4901)  

CJFD journal article ID (P6769)  

WoRMS source ID (P6678)

BHL

Wikidata:WikiProject BHL

Bibliographic licenses including text mining

Could add information on licensing when adding works via CrossRef, would need to create items for each license, see e.g. for links to various licenses that could be used as templates. For example,

See also

Examples

Bibliographic harvesting, RSS

RSS feed

web feed URL (P1019)

OAI

URL (P2699) OAI endpoint, qualifier protocol (P2700) Open Archives Initiative Protocol for Metadata Harvesting (Q2430433)
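Once an OAI endpoint is recorded this way, harvesting is a matter of appending the standard OAI-PMH parameters to it. A minimal sketch that builds a ListRecords request; the endpoint URL is a placeholder and oai_list_records is an illustrative name.

```python
from urllib.parse import urlencode


def oai_list_records(endpoint, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # e.g. "2020-01-01"
    if set_spec:
        params["set"] = set_spec
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint, standing in for the value of URL (P2699)
url = oai_list_records("https://example.org/oai", from_date="2020-01-01")
print(url)
```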

Publishing engines

software engine (P408) Open Journal Systems (Q1710177)

Engines for taxonomy journals

select * {
  # scientific journals
  ?journal wdt:P31 wd:Q5633421 .
  ?journal wdt:P1476 ?title .

  # software engine
  ?journal wdt:P408 ?engine .
  ?engine rdfs:label ?label .
  FILTER(LANG(?title) = "en")
  FILTER(LANG(?label) = "en")

  ?article schema:about ?journal .
  FILTER(regex(str(?article), ""))

  # country of origin and its coordinates
  ?journal wdt:P495 ?country .
  ?country wdt:P625 ?coordinates .
}
LIMIT 10

Things to fix

New species and new records of ant-eating spiders from Mediterranean Europe (Araneae: Zodariidae) (Q104465474) has the same DOI cited many times, but this is an error as each reference is different! So we have massive duplication of works cited.
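A quick way to flag this kind of problem, assuming the cited works and their DOIs have already been retrieved for an article: count how often each DOI occurs in the reference list. The DOIs below are invented and repeated_dois is a hypothetical helper.

```python
from collections import Counter


def repeated_dois(cited):
    """
    cited: list of (ordinal, doi) pairs from one article's 'cites work' list.
    Returns {doi: count} for DOIs that appear more than once.
    """
    # DOIs are case-insensitive, so compare them in lower case
    counts = Counter(doi.lower() for _, doi in cited if doi)
    return {doi: n for doi, n in counts.items() if n > 1}


# Invented reference list
cited = [
    (1, "10.1234/EXAMPLE.1"),
    (2, "10.1234/example.1"),
    (3, "10.1234/example.2"),
    (4, None),
    (5, "10.1234/example.1"),
]

duplicates = repeated_dois(cited)
print(duplicates)
```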

Other

(see also Template:Bibliographic_properties and Wikidata:WikiProject_Books)

IUCN conservation status (P141)   BHL page ID (P687)  

zoological specimen (Q2114846)     

Babel user information
en-N This user has a native understanding of English.
de-1 Dieser Benutzer beherrscht Deutsch auf grundlegendem Niveau.
zh-1 这位用户的中文达到初级水平
fr-0 Cet utilisateur n’a aucune connaissance en français (ou le comprend avec de grandes difficultés).
ja-0 この利用者は日本語分かりません (または理解するのがかなり困難です)。
vi-0 Thành viên này hoàn toàn không biết tiếng Việt (hoặc rất khó khăn để hiểu).
pt-0 Este utilizador não compreende português (ou compreende com dificuldades consideráveis).
es-0 Este usuario no tiene ningún conocimiento del español (o lo entiende con mucha dificultad).
cs-0 Tento uživatel nerozumí česky (nebo rozumí se značnými problémy).
th-0 ผู้ใช้คนนี้ไม่มีความรู้เกี่ยวกับภาษาไทย (หรือเข้าใจได้ด้วยความยากลำบาก)
ru-0 Этот участник не владеет русским языком (или понимает его с трудом).
ar-0 هذا المستخدم ليس لديه معرفة بالعربية (أو يفهمها بصعوبة بالغة).
ko-0 이 사용자는 한국어모르거나, 이해하는 데 어려움이 있습니다.
sk-0 Tento užívateľ nerozumie po slovensky (alebo rozumie so značnými problémami).
nl-0 Deze gebruiker heeft geen kennis van het Nederlands (of begrijpt het met grote moeite).
ms-0 Pengguna ini tidak mampu bertutur dalam (atau sukar memahami) bahasa Melayu.
id-0 Pengguna ini tidak memiliki pengetahuan bahasa Indonesia (atau memahaminya dengan sangat sulit).