Wikidata talk:WikiProject DNB


New project, previous discussions

@Andrew Gray, Charles Matthews, Filceolaire, Visite fortuitement prolongée, Pigsonthewing, Magnus Manske: @Hsarrazin, DavidMar86hdf, Pengo, Haplology, Andre Engels, Harmonia Amanda:

There are currently discussions about this at:

Maybe this page can help sort things out. --- Jura 11:55, 25 March 2015 (UTC)


Linking main subject (P921) based on Wikisource:Template:DNB00

The template DNB00 at Wikisource generally includes a link to English language Wikipedia. This could be used to build main subject (P921) on Wikidata items for DNB entries (sample: Q15985730 has P:P921 = Q5481772).

Based on this, the template currently adds the following categories:

This could help for a start. --- Jura 14:13, 25 March 2015 (UTC)

The same can be done with Wikisource:Template:DNB01, Wikisource:Template:DNB12. --- Jura 18:15, 25 March 2015 (UTC)

Yes. Automation is possible, but with some caveats (I did a pass through those enWS to enWP links in 2014–5).
While most of those links are good, i.e. give you the correct main subject, there are reasons they might not:
  1. link can run to a redirect page;
  2. link can run to a disambiguation or list page;
  3. link can run to a person who has been wrongly identified;
  4. link can run to a page that would be instance of something other than human.

All these definitely can happen. In my pass through I tried to fix all cases of 1, 2 and 3. I didn't fix 4 because it wasn't obvious how. (Example: s:Orm (DNB00) links to w:Ormulum, and an article for Orm himself should not be created under enWP policy.) My idea, the next time I did a pass, was to create a category for case 4 on enWS, to aid its exclusion from bot operations.

Actually, knowing more about Wikidata now, I would say that case 4 is quite hopeful. For Orm the chain runs Orm -> Ormulum -> Ormulum (Q2534421), whose instance of (P31) is "manuscript" or "literary work". Clearly in principle those cases can be found by following links, and so the Wikisource category could be populated by a bot.

Case 1 is also a clear job for a bot: replace the redirect by its target. Case 2: if the linked page carries the enWP template {{hndis}}, again a bot could pick that up and create a list of such pages for maintenance. That leaves an edge case: some of the links run to pages about families (which is part of case 4, one hopes), but some are awkward lists where a few generations have the same name (this happens for artists).

So, to sum up, main subjects could be added for quite a large number of cases, by automation. The number would be over 20,000. The run might be imperfect in just a few cases. Charles Matthews (talk) 10:33, 1 May 2015 (UTC)
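The triage described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `LinkedPage` record and its fields are hypothetical, the data is supplied by hand rather than fetched from any wiki, and case 3 (a wrongly identified person) is left out because it needs human judgement.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

HUMAN = "Q5"  # Wikidata item for "human"


@dataclass
class LinkedPage:
    """Hypothetical record for the enWP page that a DNB header links to."""
    title: str
    redirect_target: Optional[str] = None  # set when the page is a redirect
    is_disambiguation: bool = False        # e.g. the page carries {{hndis}}
    instance_of: Tuple[str, ...] = ()      # P31 values of the connected item


def classify(page: LinkedPage) -> str:
    """Sort a link into case 1, 2 or 4, or accept it as a candidate."""
    if page.redirect_target is not None:
        return "case 1: redirect, retarget to " + page.redirect_target
    if page.is_disambiguation:
        return "case 2: disambiguation or list page, list for maintenance"
    if page.instance_of and HUMAN not in page.instance_of:
        return "case 4: not a human, exclude from the bot run"
    return "candidate: add main subject (P921)"


# "Q_MANUSCRIPT" is a placeholder, not a real item ID.
ormulum = LinkedPage("Ormulum", instance_of=("Q_MANUSCRIPT",))
print(classify(ormulum))  # case 4: not a human, exclude from the bot run
```

Pages that fall through to "candidate" would still include any case 3 errors, which is why the run "might be imperfect in just a few cases".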

Item for published in (P1433)

Shall we use:

  1. Dictionary of National Biography (Q1210343)
  2. Dictionary of National Biography, 1885–1900 (Q15987216), etc.
  3. or another item

for P1433? I'd tend to go with (2). --- Jura 18:15, 25 March 2015 (UTC)

I agree with (2). Specific edition is better than umbrella term. --Heron (talk) 16:54, 13 September 2015 (UTC)

Item for instance of (P31)

Shall we use:

  1. biographical article (Q19389637)
  2. a DNB specific item
  3. or another item

for P31?

I'd tend to go with (1) and remove other items from P31 if present (samples that would be removed: article (Q191067), biography (Q36279), encyclopedia article (Q17329259) and human (Q5) etc.). --- Jura 18:15, 25 March 2015 (UTC)

I would go with (1). Filceolaire (talk) 05:39, 26 March 2015 (UTC)

I have switched to using "biographical article"; so agree. The items I have come across are marked "edition", which is technically correct. In fact the Wikisource text of the DNB is the first edition, while the text on the ODNB website marked "DNB archive" is a later edition (I think 1912, so also public domain, but can't be sure in all cases). So there is merit in having "edition" there. Charles Matthews (talk) 10:15, 1 May 2015 (UTC)
Someone had gone through and added version, edition or translation (Q3331189) for instance of (P31) to many items, which makes no sense, so I switched some to encyclopedia article (Q17329259), as I found it to be the most specific. Now that I see this discussion, though, I think giving them both encyclopedia article (Q17329259) and biographical article (Q19389637) is probably best. Hazmat2 (talk) 17:32, 14 May 2015 (UTC)

Rename this project

I think we need a solution which works with all the reference works on all the language wikisources. Developing a solution which only works for DNB could be a problem. Filceolaire (talk) 05:39, 26 March 2015 (UTC)

I had an idea five years ago about "Reference Commons", i.e. the subset of Wikisource devoted to reference works. It sounds as if we are thinking along the same lines. In any case my concept will probably be realised here, if anywhere.
What one means by "reference work" might vary, but what interests me is those works divided up by article. So there is a title (headword or phrase) and "main subject" should derive from the title (i.e. we assume the article is basically on-topic). Then the matching page here is set up to have "instance of [...] article", and "main subject = topic derived from title".
So far so good. The other typical statement is part of (P361) or published in (P1433). I quite like the former for the DNB, which has a first edition and then around eight supplements: one can be specific about the part, and those parts are all "published in" the DNB. Seems to be best of both worlds.
For other reference works, where do you see this possibly going wrong? Charles Matthews (talk) 10:53, 1 May 2015 (UTC)
@Filceolaire: care to explain? --- Jura 02:15, 3 May 2015 (UTC)
@Charles Matthews: The only issue I see with using published in (P1433) is that we don't need it (and I don't think that use is intended). I see published in (P1433) as meant for, say, an article in a scientific journal, a poem in a compilation, or a short story in a magazine or literary journal. I think part of (P361) is perfect and the best of both worlds. The item is a part of a specific edition or supplement. In turn, that is an edition of DNB. Therefore, by following those relationships, an item that uses part of (P361) is already part of DNB as well. It's less work and gives the same data relationship that you're going for. Hazmat2 (talk) 17:41, 14 May 2015 (UTC)
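The "by relationship" argument can be illustrated with a small sketch that follows statements from an article item up to the DNB as a whole. The statements below are typed in by hand and only model the comment above (article part of an edition, edition being an edition of the DNB); whether the live items carry exactly these statements is an assumption, and `belongs_to` is a hypothetical helper.

```python
# Hypothetical statements modelled on the comment above: the article is
# part of (P361) an edition, and that edition is an edition of (P629) the DNB.
STATEMENTS = {
    "Q19083818": {"P361": "Q15987216"},  # Dixon, Joseph (DNB00) -> 1885–1900 edition
    "Q15987216": {"P629": "Q1210343"},   # 1885–1900 edition -> DNB (assumed)
}


def belongs_to(item, work, properties=("P361", "P629")):
    """Follow the given properties from item and report whether work is reached."""
    seen = set()
    current = item
    while current is not None and current not in seen:
        if current == work:
            return True
        seen.add(current)
        links = STATEMENTS.get(current, {})
        # Take the first listed property that the current item carries.
        current = next((links[p] for p in properties if p in links), None)
    return False


print(belongs_to("Q19083818", "Q1210343"))  # True
```

A query service could express the same walk with a property path, so the umbrella item need not be stated directly on every article.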

Examples

Charles Matthews, Hazmat2, Jura See the following.

Dixon, John (d.1715) (DNB00) (Q19083811)     
Dixon, Joseph (DNB00) (Q19083818)     
Which is better? Filceolaire (talk) 05:37, 15 May 2015 (UTC)
Label
I prefer the label with the first name first, even if it is different from the name of the source page/DNB chapter. Filceolaire (talk) 05:51, 15 May 2015 (UTC)
Agreeing with Jura (comment below) I prefer to copy the DNB title verbatim, as in "Smith, John", since the item is about the DNB article and not about the person. You wouldn't rearrange the words of a book title, so why do it to DNB articles?
Also, I hope we are all agreed that the many surname-only entries (such as Barry and its homonyms) are undesirable and should be fixed.--Heron (talk) 17:16, 13 September 2015 (UTC)
Description
I am currently using just "biography", but I'm thinking of extending this to "biography in DNB00". --Heron (talk) 17:16, 13 September 2015 (UTC)
Should we use edition or translation of (P629) or published in (P1433) or both?
I like John Dixon - use both but have P629 refer to the series and have P1433 refer to the (1885–1900) edition. Filceolaire (talk) 05:51, 15 May 2015 (UTC)
Thinking about this in light of Hazmat2's comment below. I agree with Hazmat2 that edition or translation of (P629) is more appropriate to use on the Wikisource item for the whole DNB rather than on each section. I don't agree with Hazmat2 that part of (P361) is the way to link to the item for the Wikisource DNB. I think published in (P1433) is better for this. Filceolaire (talk) 15:12, 15 May 2015 (UTC)
Reference to the publication/volume/chapter as qualifiers or primary statements?
I think qualifiers look better in Reasonator and make more sense. If it is a bit harder for some clients to get the info this hardly matters as the info isn't that important. Filceolaire (talk) 05:51, 15 May 2015 (UTC)
Just to add a quick note, qualifiers shouldn't be an issue. Most of the issues, if not all, have been fixed, and eventually they all will be. Hazmat2 (talk) 11:40, 15 May 2015 (UTC)
  • Just FYI, both have the incorrect instance (version, edition or translation (Q3331189)). edition or translation of (P629) is also wrong, as this is not an edition or translation of the book but simply a part of it. I fixed it. My two cents on everything: I prefer first name first because that's often how people search. Nonetheless, I use title (P1476) to hold the article title; whether it's worth it I don't know, but it takes only a second. See how I used qualifiers at Dixon, Joseph (DNB00) (Q19083818) to do basically what you did. (I didn't change the title, though.) As for the English description, the current one is misleading. I prefer to use the term entry or article (i.e. "Dictionary of National Biography entry" or "article in the Dictionary of National Biography") to distinguish between an actual biography and what amounts to a biographical sketch or a biographical dictionary/encyclopedia entry. Genre isn't needed, as the item is already tied to DNB; using it would be redundant (similar to how you don't use publication year). See the changes I made.
My example isn't necessarily perfect for DNB either, but you may wish to see John Veitch Shoemaker (Q19863068) for a simplistic approach. I did not use the qualifier for the section because it's only one volume and I'm already using the title. Hazmat2 (talk) 11:34, 15 May 2015 (UTC)
Thanks Hazmat2. I agree on 'instance of' and 'title' but I think published in (P1433) is better than part of (P361); see my comment above. Filceolaire (talk) 15:19, 15 May 2015 (UTC)
That sounds good. As I wrote, I just view "published in" as referring to a different type of work than an encyclopedic/dictionary entry, but reading what you wrote got me thinking that "part of" is probably even less suitable. Hazmat2 (talk) 15:23, 15 May 2015 (UTC)
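To make the options in this thread concrete, here is a minimal sketch of the statements that one article item might carry if the choices favoured above were followed: biographical article for instance of, the verbatim DNB title in title (P1476), and the specific edition in published in (P1433). It is a plain dictionary, not a call to any Wikidata library, and `article_statements` is a hypothetical helper, not an agreed project template.

```python
BIOGRAPHICAL_ARTICLE = "Q19389637"
DNB_1885_1900 = "Q15987216"


def article_statements(title, main_subject=None, edition=DNB_1885_1900):
    """Return the statements for one DNB article item as a plain dict."""
    statements = {
        "P31": BIOGRAPHICAL_ARTICLE,  # instance of
        "P1476": title,               # title, copied verbatim from the DNB
        "P1433": edition,             # published in
    }
    if main_subject is not None:
        statements["P921"] = main_subject  # main subject
    return statements


# The title is a made-up placeholder; Q5481772 is the main subject of the
# sample item Q15985730 quoted earlier on this page.
example = article_statements("Surname, Forename", main_subject="Q5481772")
print(example["P31"], example["P1433"], example["P921"])
```

Swapping P1433 for part of (P361) would be a one-line change, which is the point still under discussion.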

Ralph of Coggeshall (DNB00) (Q19091989) and onwards referrals

At the enWS side we are sometimes in and sometimes out with DNB articles that merely refer onwards to another article, e.g. Ralph of Coggeshall (DNB00) (Q19091989). Now we have the issue that they are in WD, and of how we classify them. Somewhere locally there is a property that allows reference to the primary article, and I am seeking feedback about which it is. The other option is that we just call those beasties at enWS non-notable, and request their deletion here at WD.  — billinghurst sDrewth 01:11, 9 July 2015 (UTC)

Review of matches

Please direct your attention to the following discussion: [2]. Jonathan Groß (talk) 14:45, 3 September 2015 (UTC)

page rewritten.

I have:

  • added a section on why each DNB bio has a Wikidata item separate from the item for the subject;
  • edited the sample items as per the comments.

Joe Filceolaire (talk) 20:30, 3 September 2015 (UTC)

There seems to be some disagreement about it: see User talk:SKbot. --- Jura 08:12, 26 November 2015 (UTC)

Wikimania 2016

Only this week left for comments: Wikidata:Wikimania 2016 (Thank you for translating this message). --Tobias1984 (talk) 12:09, 25 November 2015 (UTC)

Main subjects now complete: state of the art

The addition of main subject (P921) statements to all the items here for the DNB has been complete for a little while now. (A SPARQL query written to check this should be returning an empty list, therefore. I get four items, all of which are done, so this is presumably some caching artefact.)
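The query mentioned above is not reproduced on this page, so the following is only a guess at its general shape. It builds a SPARQL string that lists article items of one edition with no main subject (P921). Identifying the articles through published in (P1433) is an assumption, and `missing_main_subject_query` is a hypothetical helper; the string would still have to be pasted into a query service to run.

```python
import re

QID = re.compile(r"Q[1-9][0-9]*")


def missing_main_subject_query(edition_qid="Q15987216"):
    """Build a SPARQL query listing article items without main subject (P921)."""
    if not QID.fullmatch(edition_qid):
        raise ValueError("not a Wikidata item ID: " + repr(edition_qid))
    return (
        "SELECT ?article WHERE {\n"
        # Assumption: articles are tied to their edition by published in (P1433).
        f"  ?article wdt:P1433 wd:{edition_qid} .\n"
        "  FILTER NOT EXISTS { ?article wdt:P921 ?subject }\n"
        "}"
    )


print(missing_main_subject_query())
```

An empty result would indicate that every article item of that edition has a main subject, subject to the caching delays noted above.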

In other words, all the DNB articles A on English Wikisource now link to their own data item B, and this data item B now links to the Wikidata item C for the subject of the biography A. The completion of this task opens the way to various kinds of checking, maintenance and addition of statements, by bot.

I thought I would spell these bot tasks out here. Some potentially apply to similar works, such as Allgemeine Deutsche Biographie (Q590208). The fact that there are ODNB codes here as Oxford Dictionary of National Biography ID (P1415), and items have been created for that identifier, gives a further dimension.

One task is to complete the links back of type C -> B with described by source (P1343), as has already been done in about 70% of cases.

Bringing in other wikis, a task of particular interest to w:WP:WPDNB is to investigate the cases where item C has an enWP link, to a Wikipedia page D. Then A should link to D via the DNB header field "wikipedia=". It is now possible for a bot to list all the cases where there is no link.

One should also consider that A may link to D1 on enWP, while C links to D2, or to nothing. These cases need investigation and maintenance. Merges can be found this way. When everything is complete we have a definite way round a square D -> C -> B -> A -> D, with the link C -> B carried by the described by source (P1343) statement that is the converse to main subject (P921); and the page C should carry Oxford Dictionary of National Biography ID (P1415). It would be a major step forward now for this project to check everything.

Charles Matthews (talk) 12:04, 22 July 2016 (UTC)
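The checks around the square A, B, C, D could be sketched as below, working on rows that have already been gathered. The `Row` record and its field names are hypothetical, and the sample identifiers are placeholders; collecting the data from enWS and Wikidata is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Row:
    """Hypothetical record for one DNB article and what it links to."""
    article: str                 # A: page title on enWS
    article_item: str            # B: Wikidata item for the article
    subject_item: Optional[str]  # C: main subject (P921) value on B
    has_backlink: bool           # C carries described by source (P1343)
    subject_enwp: Optional[str]  # D2: enWP sitelink of C
    header_enwp: Optional[str]   # D1: "wikipedia=" field in the header of A


def check_square(row: Row) -> List[str]:
    """Return the problems found for one row; an empty list means it closes."""
    if row.subject_item is None:
        return ["B has no main subject (P921)"]
    problems = []
    if not row.has_backlink:
        problems.append("C lacks described by source (P1343)")
    if row.subject_enwp and not row.header_enwp:
        problems.append("A has no wikipedia= link to " + row.subject_enwp)
    elif row.header_enwp and row.header_enwp != row.subject_enwp:
        problems.append("A links to " + row.header_enwp + " but C does not")
    return problems


# "Q_B" and "Q_C" are placeholders, not real item IDs.
row = Row("Example (DNB00)", "Q_B", "Q_C", True, "Example person", None)
print(check_square(row))  # ['A has no wikipedia= link to Example person']
```

Rows reporting a D1 and D2 mismatch are the ones most likely to reveal merge candidates or wrong identifications.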

Thanks CM. It would also be useful if we could use SPARQL to check that two-way author linking exists between Author: namespace pages and articles, similar to the tool that MM built locally years ago. Oh, is this also the case for the supplements? Up until last week I was still seeing missing pages linking through to the DNB01 articles.  — billinghurst sDrewth 14:51, 22 July 2016 (UTC)

I went through DNB01 and DNB12 first; so they should be complete. But maybe the SPARQL query above assumes DNB00. (Currently - there was an older, slower tool by Andrew Gray that broke.)

For more checking without bots, I would like to get my head around the powers of Petscan when it uses a SPARQL query as one of the inputs. This should be potent.

As far as the matching of Wikisource to enWP goes, the essential thing is to be able to generate the list of items here that are relevant "main subjects" of unmatched DNB articles. If that were done, once and for all, it could become a PagePile, for maintenance as we go ahead. I need to keep thinking about a botless way to generate such a list, in case there is one. Charles Matthews (talk) 08:21, 28 July 2016 (UTC)

Update: prompted by the query on completeness for DNB01 and DNB12, I ran modified queries, and there were just a few missing there. All now done. Charles Matthews (talk) 08:55, 28 July 2016 (UTC)

Maintenance with described by source (P1343)

I have been using a query that checks whether items carrying the statement described by source (P1343) Dictionary of National Biography, 1885–1900 (Q15987216) also carry Oxford Dictionary of National Biography ID (P1415), i.e. have been tagged with an ODNB identifier. Many mismatches and merges have been found this way.

Not all such items should carry one: for example, Aerated Bread Company (Q4687954) shouldn't carry the Oxford Dictionary of National Biography ID (P1415) that belongs to John Dauglish (Q18546547), as that would confuse the company with its founder.

That was a DNB00 check, of about 150 hits when first run, now returning 68, which can be looked at again; should generally be false positives though. The DNB01 version query and DNB12 version query currently return no hits. Charles Matthews (talk) 05:00, 2 August 2016 (UTC)
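A check of this kind might look like the following sketch, which is not the query actually used. It builds a SPARQL string for items described by a given edition that lack an ODNB identifier. The `humans_only` switch is an added assumption: restricting to instance of human would presumably drop false positives such as the company above, at the cost of hiding families and other non-human subjects. The item IDs of the DNB01 and DNB12 supplements are not given on this page, so the edition is left as a parameter.

```python
def missing_odnb_id_query(edition_qid="Q15987216", humans_only=True):
    """Build a SPARQL query for items described by an edition but lacking P1415."""
    lines = [
        "SELECT ?item WHERE {",
        "  ?item wdt:P1343 wd:" + edition_qid + " .",  # described by source
    ]
    if humans_only:
        # Assumption: limit to instance of (P31) human (Q5) to cut false positives.
        lines.append("  ?item wdt:P31 wd:Q5 .")
    lines.append("  FILTER NOT EXISTS { ?item wdt:P1415 ?odnb }")
    lines.append("}")
    return "\n".join(lines)


print(missing_odnb_id_query())
```

Running it with `humans_only=False` would give the broader list, closer to the check described above.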

ODNB completeness

A complete pass through the Oxford Dictionary of National Biography ID (P1415) identifiers has left the number of items tagged as 60,452. This compares with 58,552 exactly a year ago. The whole range from 101000001 to 101110999 has been checked, and while there may be some omissions still, the update has probably covered almost everything.

Some basic figures are that 59,975 are "instance of human", 28 instance of "fictional human"; and there are 53,369 males and 6,644 (i.e. 11%) females. The ODNB announced 60K biographies at the recent update (October). There are over 300 articles on families. Most of the rest comes from various other groups.

The next update to the ODNB will be during January. A script by User:Andrew Gray means we will in future be able to monitor completeness, in a way not possible until recently. Charles Matthews (talk) 20:34, 1 January 2017 (UTC)

Hurrah! Good start to the year. Andrew Gray (talk) 12:25, 2 January 2017 (UTC)
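For the range check described in this section, a minimal sketch of one possible approach is given below. It is not Andrew Gray's script. It assumes the identifiers already used on Wikidata have been exported as integers, and it simply lists the numbers in the checked range that are not among them. Not every number in the range need correspond to an ODNB entry, so the output is a list of candidates to look up, not of confirmed omissions.

```python
def unmatched_ids(known_ids, first=101000001, last=101110999):
    """Return the identifiers in [first, last] that are absent from known_ids."""
    known = set(known_ids)
    return [n for n in range(first, last + 1) if n not in known]


# Tiny illustrative range; the defaults cover the range quoted above.
print(unmatched_ids({101000001, 101000003}, 101000001, 101000004))
# [101000002, 101000004]
```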

DNB27

Just a note to say that Dictionary of National Biography, third supplement (Q116234546), published 1927, is on the verge of completion on enWS. That gives some 477 new articles which could do with Wikidata items. main subject (P921) statements should appear on all of them - old DNB maps into the ODNB, and Oxford Dictionary of National Biography ID (P1415) statements are supposed to be complete (apart maybe from recent additions). Charles Matthews (talk) 06:22, 29 March 2023 (UTC)
