Property talk:P4438

Documentation

BFI Films, TV and people ID
identifier for a person or movie at BFI Film & TV Database. Format: 13 digits and lowercase letters
Associated item: British Film Institute (Q260528)
Applicable "stated in" value: BFI Film & TV Database (Q4835523)
Data type: External identifier
Corresponding template: Template:BFI (Q20089486), Template:BFI person (Q122964854)
Domain: human (Q5), film (Q11424), television program (Q15416), television series episode (Q21191270), film production company (Q1762059), film series (Q24856), film studio (Q375336), group of humans (Q16334295), musical ensemble (Q2088357), fictional character (Q95074), group of fictional characters (Q14514600), brand (Q431289) or special purpose artist (Q59755569)
Allowed values: [0-9a-f]{13}
Example:
  • Tim Curry (Q52392) → 4ce2b9f1de72f
  • Alfred Hitchcock (Q7374) → 4ce2b9ee3449d
  • I, Daniel Blake (Q23823458) → 579dce2329553
  • The Pillars of the Earth (Q283073) → 4f4bb24fa0bfe
  • iCarly (Q3013) → 4ce2b8c964851
Source: https://www.bfi.org.uk/films-tv-people/
Formatter URL: https://web.archive.org/web/0/https://www.bfi.org.uk/films-tv-people/$1
Tracking: usage: Category:Pages using Wikidata property P4438 (Q55283095)
Related to country: United Kingdom (Q145) (See 329 others)
See also: BFI Filmography person ID (P4326), BFI National Archive work ID (P2703)
Current uses
Total: 22,211
Main statement: 21,830 (98.3% of uses)
Qualifier: 6 (<0.1% of uses)
Reference: 375 (1.7% of uses)
Format “[0-9a-f]{13}”: value must be formatted using this pattern (PCRE syntax).
List of violations of this constraint: Database reports/Constraint violations/P4438#Format, hourly updated report, SPARQL
Item “instance of (P31)”: Items with this property should also have “instance of (P31)”.
List of violations of this constraint: Database reports/Constraint violations/P4438#Item P31, hourly updated report, search, SPARQL
Distinct values: this property likely contains a value that is different from all other items.
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4438#Unique value, SPARQL (every item), SPARQL (by value)
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type.
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4438#Entity types
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified way.
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4438#Scope, SPARQL
Label required in languages: en: Entities using this property should have labels in one of the following languages: en.
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4438#Label in 'en' language, search, SPARQL
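
The format constraint and formatter URL documented above can be checked mechanically. The following is a minimal Python sketch, assuming only the pattern [0-9a-f]{13} and the archive formatter URL given in the documentation; the function names are illustrative and are not part of any Wikidata tooling.

```python
import re

# Pattern and formatter URL copied from the property documentation.
BFI_ID_PATTERN = re.compile(r"[0-9a-f]{13}")
FORMATTER_URL = (
    "https://web.archive.org/web/0/https://www.bfi.org.uk/films-tv-people/$1"
)


def is_valid_bfi_id(value: str) -> bool:
    """Return True if the whole value matches the format constraint."""
    return BFI_ID_PATTERN.fullmatch(value) is not None


def archive_url(value: str) -> str:
    """Build the Wayback Machine URL for a valid ID (hypothetical helper)."""
    if not is_valid_bfi_id(value):
        raise ValueError(f"not a valid P4438 value: {value!r}")
    return FORMATTER_URL.replace("$1", value)


if __name__ == "__main__":
    # Example value taken from the documentation (Tim Curry, Q52392).
    print(archive_url("4ce2b9f1de72f"))
```

Using fullmatch rather than search means values with extra leading or trailing characters are rejected, which is how the constraint is meant to be read.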

Not working anymore

This property no longer works in most cases. Only some IDs have been redirected to new ones. For example, the URL https://www.bfi.org.uk/films-tv-people/579dce2329553 (for I, Daniel Blake (Q23823458)) now redirects to https://www.bfi.org.uk/film/0e964f0b-e7aa-54e6-a6f3-544f18d14feb. Horcrux (talk) 09:47, 7 October 2023 (UTC)

User:Horcrux, I worked on this on enwiki for the past 3 days and fixed over 16,000 links.
  • I found roughly a 90% dead link rate. It is unclear why they kept the other 10% working. Template:BFI on enwiki is currently up for deletion after being converted to archive URLs (1,200 instances). The same template is deployed in other languages, see "This property is being used by" above. Those templates need to be converted to archive URLs on each wiki, a huge job requiring bot permissions on each.
  • As for Wikidata, because the original ID and database system are in effect no longer in existence, the property should be nuked from Wikidata to avoid confusion. BFI is creating a new database and new IDs, but it is very different from the old one. There is no map to the new IDs. It will take years or decades for the community to manually determine the IDs for each film and actor. (By which time BFI will destroy the DB and make a new one.)
  • As for the new IDs, the example URL https://collections-search.bfi.org.uk/web/Details/ChoiceFilmWorks/150027925 has 2 possible BFI identifiers: 150027925 and 23135. This page says the BFI identifier is 150027925, but the ChoiceFilmWorks page says it is 23135. I believe the BFI ID is 23135, because if you choose "BFI Reference Number" in Expert Search and enter 23135, it goes to the page for the work at the /ChoiceFilmWorks/150027925 URL, whereas entering a BFI Reference Number of 150027925 doesn't work. This is unfortunate because there is no way to map the BFI ID 23135 to the URL /ChoiceFilmWorks/150027925. It's as if you need to know both numbers, one a BFI ID and the other a "URL ID", if that's what it is, and assuming that number even remains stable over time.
-- GreenC (talk) 05:14, 17 October 2023 (UTC)
@GreenC: Thank you for your analysis. As far as Wikidata is concerned, I think the only thing we can do right now is this. Indeed, we usually keep IDs for discontinued services as well (see here). Once the situation has stabilized, we will create a new property. --Horcrux (talk) 07:10, 17 October 2023 (UTC)
@Horcrux: Someone with API access to EIDR might be able to figure this out at scale. For example, for the film Birds of a Feather, here are all the versions. The 1931 version shows the new BFI ID as 150047127 and a Wikidata ID of Q3569872. Excellent: simply search EIDR for all Wikidata IDs, then update Wikidata with the new BFI ID when EIDR has it. Then it would be possible to update Wikipedia links, because we also have the old BFI IDs in Wikidata. Same process: search Wikipedia for all old BFI IDs, then search Wikidata for each old ID, retrieve the new ID from there and update the URL. Easier said than done, of course, but a path exists. -- GreenC (talk) 17:27, 17 October 2023 (UTC)
I wasn't aware of this discussion until now. FYI, yesterday I asked BFI if it's possible to provide a map from BFI identifier to ChoiceFilmWorks ID. It will be interesting to see what they say. From the discussion above it looks like there may be no simple map :-( Tobyhoward (talk) 12:24, 18 October 2023 (UTC)

@GreenC: Great! I've also just realized that we have BFI National Archive work ID (P2703) for identifiers such as 150047127, so we also have a direct mapping, returned by the following query:

SELECT DISTINCT ?old_bfi_id ?other_bfi_id
WHERE {
  ?x wdt:P4438 ?old_bfi_id ;
     wdt:P2703 ?other_bfi_id ;
     #wdt:P2704 ?eidr_id  .
}

Anyway, I think that such replacements should be done cautiously: if the old BFI source was providing some data, are we sure that the corresponding item on Collections Search reports the same information? Maybe in most cases, even on Wikipedia, the best thing to do would be to link to the Wayback Machine URL. --Horcrux (talk) 18:17, 17 October 2023 (UTC)
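
The cautious replacement described here can be sketched in Python. This is only an illustration under stated assumptions: the query results are available in the standard SPARQL JSON results layout, an old ID is replaced only when it maps to exactly one P2703 value, only 9-digit values are treated as usable in a ChoiceFilmWorks URL, and everything else falls back to the Wayback Machine. The sample rows use placeholder IDs, not real BFI identifiers, and the function names are hypothetical.

```python
import json
import re
from collections import defaultdict

WORKS_URL = "https://collections-search.bfi.org.uk/web/Details/ChoiceFilmWorks/"
WAYBACK_URL = "https://web.archive.org/web/0/https://www.bfi.org.uk/films-tv-people/"

# Placeholder rows in the SPARQL JSON results format (not real identifiers).
SAMPLE_RESULTS = """
{"results": {"bindings": [
  {"old_bfi_id": {"type": "literal", "value": "0000000000001"},
   "other_bfi_id": {"type": "literal", "value": "150000001"}},
  {"old_bfi_id": {"type": "literal", "value": "0000000000002"},
   "other_bfi_id": {"type": "literal", "value": "150000002"}},
  {"old_bfi_id": {"type": "literal", "value": "0000000000002"},
   "other_bfi_id": {"type": "literal", "value": "150000003"}}
]}}
"""


def build_mapping(results_json):
    """Map old P4438 values to P2703 values, dropping ambiguous old IDs."""
    candidates = defaultdict(set)
    for row in json.loads(results_json)["results"]["bindings"]:
        candidates[row["old_bfi_id"]["value"]].add(row["other_bfi_id"]["value"])
    return {old: next(iter(new)) for old, new in candidates.items() if len(new) == 1}


def replacement_url(old_id, mapping):
    """Prefer a mapped Collections Search URL, else fall back to the archive."""
    work_id = mapping.get(old_id)
    if work_id is not None and re.fullmatch(r"[0-9]{9}", work_id):
        return WORKS_URL + work_id
    return WAYBACK_URL + old_id


if __name__ == "__main__":
    mapping = build_mapping(SAMPLE_RESULTS)
    for old in ("0000000000001", "0000000000002"):
        print(old, "->", replacement_url(old, mapping))
```

Whether the Collections Search page actually carries the same information as the old page still has to be checked by a human; the sketch only decides which URL is a candidate.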

Nice. Incidentally, notice the mix of ID types in the results. Some are the BFI URL ID (9 digits), and some are the BFI Reference ID (4-7 digits). Do you think there should be another property for the Reference ID, i.e. BFI National Archive reference ID?
Your query shows it is possible to make a map, which we didn't have before, but it covers only 2,680 out of the tens of thousands in use on Wikipedia, because so many items are missing BFI National Archive work ID (P2703). I read this doc (pg. 4), which shows how to query EIDR via a GET API using the EIDR content ID (P2704), and this query shows 72,000 items with that ID on Wikidata. Of those, only 7,135 have a BFI National Archive work ID (P2703). So 90% of our records that have an EIDR content ID (P2704) are missing the BFI National Archive work ID (P2703). Yet they could easily be retrieved by GETing the metadata from EIDR. I would be happy to do this and provide a text file with the data, but I have no idea how to upload to Wikidata at scale. -- GreenC (talk) 01:43, 18 October 2023 (UTC)
@GreenC: It would be pretty easy using QuickStatements, which takes TSV/CSV files as input. Let me know whether you prefer to use it yourself or to upload the data somewhere else (in that case I could run the QS batch) ;-) Thank you! --Horcrux (talk) 07:24, 18 October 2023 (UTC)
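
For reference, a batch of this kind could be prepared roughly as follows. This is a minimal Python sketch that assumes the tab-separated QuickStatements command syntax (item, property, value, with string values in double quotes); the function name is hypothetical, and the single example pair is the one mentioned earlier in this thread (Q3569872 and 150047127).

```python
import re


def quickstatements_rows(pairs):
    """Turn (QID, P2703 value) pairs into tab-separated QuickStatements lines."""
    lines = []
    for qid, work_id in pairs:
        if not re.fullmatch(r"Q[1-9][0-9]*", qid):
            raise ValueError(f"not a QID: {qid!r}")
        if not re.fullmatch(r"[0-9]+", work_id):
            raise ValueError(f"not a numeric work ID: {work_id!r}")
        # External identifiers are passed as quoted strings.
        lines.append(f'{qid}\tP2703\t"{work_id}"')
    return "\n".join(lines)


if __name__ == "__main__":
    print(quickstatements_rows([("Q3569872", "150047127")]))
```

Validating both columns before generating the batch keeps malformed rows out of the upload, which is easier than reverting bad statements afterwards.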
I'll give it a try. It's possible EIDR was populated with data from Wikidata, in which case it will be 7k or fewer as well, but there is only one way to find out. -- GreenC (talk) 16:29, 18 October 2023 (UTC)
User:Horcrux: After scouring the EIDR website, I was able to find some and uploaded the results via QuickStatements. There are now 8,859 items with BFI National Archive work ID (P2703), an increase of 1,724 from the original 7,135. Not great, considering what might have been possible (hundreds of thousands). EIDR and Wikidata are already in sync.
The next step: search all Wikipedia languages for https://collections-search.bfi.org.uk URLs with the 9-digit work ID. For each one (e.g. https://collections-search.bfi.org.uk/web/Details/ChoiceFilmWorks/150032887), scrape the EIDR URL, then check that page for the Wikidata Q number (e.g. https://ui.eidr.org/view/content?id=10.5240/1231-F93D-AD03-7C2E-B28C-7), then update P2703 on the Q page with the number found in the original URL, e.g. 150032887. -- GreenC (talk) 04:12, 21 October 2023 (UTC)
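
The first step of that process, pulling the work IDs out of page text, could look something like the Python sketch below. It assumes the ChoiceFilmWorks URL shape shown in this thread and a 9-digit work ID; the function name is illustrative, and the scraping of EIDR pages is deliberately left out.

```python
import re

# Matches Collections Search work URLs and captures the 9-digit work ID.
WORK_URL = re.compile(
    r"https?://collections-search\.bfi\.org\.uk"
    r"/web/Details/ChoiceFilmWorks/([0-9]{9})(?![0-9])"
)


def extract_work_ids(text):
    """Return the distinct work IDs found in the text, sorted."""
    return sorted(set(WORK_URL.findall(text)))


if __name__ == "__main__":
    wikitext = (
        "[https://collections-search.bfi.org.uk/web/Details/"
        "ChoiceFilmWorks/150032887 BFI record]"
    )
    print(extract_work_ids(wikitext))
```

The negative lookahead skips URLs whose numeric part is longer than 9 digits instead of silently truncating them.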
After the above process, there are now 8,926, an increase of 67. These are disappointing results, because I found 3,390 collections-search URLs across all of Wikipedia; apparently most of them had already been imported into Wikidata. I'm running out of ideas for increasing P2703. -- GreenC (talk) 18:03, 22 October 2023 (UTC)