Wikidata talk:External identifiers
Proposal for expansion of properties
Hi All,
I would like to propose an update on the use of statements regarding external identifiers in order to make the future use of these IDs on Wikidata and on Wikipedia easier. Besides expanding the properties for IDs, the goal would be to start a dialogue and create guidelines for creating and describing external IDs.
Some of the proposed properties would be new, some already exist but would need to be agreed on to be used.
Though many of these could be added to the subject item of the property, such items often do not exist, and I believe it would be easier for a future Wikipedia template to use these IDs if the properties themselves contain all the information.
Existing but not used with IDs
Property | Description | Example | Comments |
---|---|---|---|
title (P1476) | Multilingual sites have different names for each language. (Svenskt kvinnobiografiskt lexikon vs Biographical Dictionary of Swedish Women) | Currently there are a few examples where multi-lingual titles are defined in Wikidata item of this property (P1629) or in source website for the property (P1896) as qualifier. | Doesn't have to be used, just put Labels in more languages. title (P1476) is needed only for "formal" title statements --Vladimir Alexiev (talk) 08:53, 27 November 2019 (UTC) |
main subject (P921) | main topic the database covers | USHMM Holocaust Encyclopedia ID (P3724) --> main subject (P921): The Holocaust (Q2763) | |
copyright license (P275) | license under which this copyrighted work is released | World Encyclopedia of Puppetry Arts ID (P7012) --> copyright license (P275): Creative Commons Attribution-ShareAlike (Q6905942) | |
online access status (P6954) | qualifier for an ID property indicating whether linked content is directly readable online | Values: | Currently has a constraint to be used for references as qualifier |
last update (P5017) | date a reference was modified, revised, or updated | World Encyclopedia of Puppetry Arts ID (P7012) --> copyright license (P275): Creative Commons Attribution-ShareAlike (Q6905942) | Currently has a constraint to be used for references as qualifier |
- Of course it would be nice to add such props to databases (note: databases, not the corresponding external-id) --Vladimir Alexiev (talk) 08:52, 27 November 2019 (UTC)
To be created or modified
Property | Description | Comments |
---|---|---|
archive URL formatter | entries archived in the Internet Archive (in case the original site of the ID goes offline, or for IDs only found in archived form) | This way deleted sites wouldn’t be lost (this one for example has many items added on Mix’n’match: [1] vs [2]). It could also encourage creating properties for online encyclopedias which only exist in archived form. third-party formatter URL (P3303) could also be used for this purpose. See also: archive URL (P1065), archive date (P2960) |
update status | If the site is active or an old relic. Values: active (updated), abandoned (or sth), complete, offline, archived | |
catalogue type | Values: encyclopedia, library authority file, virtual exhibition item, film database, digital library etc. | This could be achieved by expanding on Wikidata property for an identifier (Q19847637) subclasses like Wikidata property related to encyclopedias (Q55452870) |
data subject type | Values: person, location, taxon, artwork, misc., etc. | Could be achieved with a new property or Wikidata property for an identifier (Q19847637) subclasses |
content type | Values: text, data, image, film | |
text type | A property to help identify sources which can serve as further reading in a Wikipedia article or as a basis for creating a new article. Usually these are encyclopedias and lexicons, but some museum or gallery sites also have detailed artist biographies. Values: informative, data | |
print version | Wikidata item for the print version of an online encyclopedia | third-party formatter URL (P3303) could be used |
developer (or publisher?) | Institution or person responsible for the site and its content. Example: Encyclopedia of Alabama ID (P6010) → Alabama Humanities Foundation (Q30257855), Auburn University (Q540672) | See also: maintained by (P126), operator (P137), sponsor (P859) |
developer institution type | Values: museum, university, library, private, community, company | This could be used for quality assessment, it shows what kind of institution is responsible for the data. |
external identifiers connected | What other catalogues are integrated or referenced in the data on this site. For example the beacon links in Bach Digital ([3]), the references in Deutsche Biographie ([4]) or the library authority IDs contained in VIAF. Perhaps this could help with data mining or Mix'n'match identification? | See also: VIAF component (Q26921380) |
--Adam Harangozó (talk) 14:39, 14 November 2019 (UTC)
- @Adam Harangozó: I think most of these are props of the associated database, not the external-id. A nice exception is "archive URL formatter" but it relies on the WHOLE external site being archived on the same date, and I'm not sure archive.org can give such guarantees? In any case, you need to propose new properties one by one and they go through a discussion/vetting process --Vladimir Alexiev (talk) 08:58, 27 November 2019 (UTC)
- @Vladimir Alexiev: I know, but the associated database/external-id distinction is quite a mess at the moment, which is why I would like to create a set of guidelines for external IDs before proposing any new properties. For example, most of the time there is no item for the associated database. For gallery collections, the subject item is set to the gallery/museum itself, and the properties listed above can't be added to those (National Gallery of Victoria artist ID (P2041)). I'm not sure if it would make sense to make a separate item for the online databases of galleries. Another example, where there is a separate item for the ID and its dictionary: US Congress Bio identifier (Q20205343). This is why I think it would be better to use these statements in the properties themselves; then all the necessary information could be added and would be in one place (which would be useful for creating Wikipedia templates using Wikidata). Adam Harangozó (talk) 17:26, 28 November 2019 (UTC)
- @Adam Harangozó: If you start adding detailed props about a database/website, you'd better make an entity for that database/website. If you just lump them together at the Identifier, it will be confusing, so people will (rightly) object to your property proposals. E.g. if you describe "print version" as applying to encyclopedias, then you cannot apply it to every identifier, because most identifiers are NOT about encyclopedias. --Vladimir Alexiev (talk) 09:05, 29 November 2019 (UTC)
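To make the "archive URL formatter" idea from the table above more concrete, here is a minimal sketch of how such a formatter could be expanded. It assumes the usual Wikidata convention that `$1` in a formatter URL stands for the identifier value, and that a Wayback Machine address has the shape `https://web.archive.org/web/<timestamp>/<original URL>`, with the timestamp being optional. The `example.org` formatter and the function names are hypothetical, not an existing property or API. As noted in the thread, a single fixed timestamp only works if the whole site was captured around the same date.

```python
# Hypothetical live formatter; "$1" is the placeholder for the identifier value.
LIVE_FORMATTER = "https://example.org/entry/$1"

# Assumed Wayback Machine prefix. Without a timestamp, the archive is
# expected to redirect to a capture of the page, if one exists.
WAYBACK_PREFIX = "https://web.archive.org/web/"


def expand(formatter: str, identifier: str) -> str:
    """Replace the "$1" placeholder with the identifier value."""
    return formatter.replace("$1", identifier)


def archive_url(formatter: str, identifier: str, timestamp: str = "") -> str:
    """Build an archived URL for one identifier.

    `timestamp` is a YYYYMMDD-style string; leave it empty when no single
    capture date can be assumed for the whole site.
    """
    live = expand(formatter, identifier)
    if timestamp:
        return f"{WAYBACK_PREFIX}{timestamp}/{live}"
    return f"{WAYBACK_PREFIX}{live}"


print(archive_url(LIVE_FORMATTER, "abc", "20191127"))
```

The sketch only builds the string; it does not check whether a capture actually exists for a given identifier.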
Incorporation of external identifiers not yet meeting requirements
In South Africa the Education Department manages its schools with EMIS codes, but currently publishes them all as XLS files. In the foreseeable future they will have something fulfilling the formatter URL requirement. Until then it'd be great to keep track of diffs and reconcile with their IDs. How would you suggest approaching this? -- YaguraStation (talk) 22:40, 15 July 2020 (UTC)
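One possible way to keep track of diffs between two published releases, as a rough sketch only: it assumes each XLS release has first been exported to CSV, and the column names `emis_code` and `school_name` are hypothetical, since the actual layout of the files is not described here. The sample rows are made up for illustration.

```python
import csv
import io


def load_snapshot(csv_text, id_column="emis_code", name_column="school_name"):
    """Read one exported release into a {code: name} mapping.

    The column names are assumptions; adjust them to the real file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column].strip(): row[name_column].strip() for row in reader}


def diff_snapshots(old, new):
    """Compare two {code: name} mappings from consecutive releases."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Codes present in both releases whose name changed.
    renamed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "renamed": renamed}


# Made-up sample data standing in for two exported releases.
old_csv = "emis_code,school_name\n100,Alpha School\n200,Beta School\n"
new_csv = "emis_code,school_name\n100,Alpha Primary School\n300,Gamma School\n"

print(diff_snapshots(load_snapshot(old_csv), load_snapshot(new_csv)))
```

The codes are kept as strings so that any leading zeros survive the comparison.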
Suggestion for required values for external identifiers
I suggest that we add that external identifiers always should have either
or
That would make it much clearer when reviewing whether an item is notable or not when it has an identifier. Perhaps this should even be in the template for property proposals for external identifiers, so it gets added on creation and that any potential discussion has already taken place. Ainali (talk) 16:41, 10 May 2022 (UTC)
- Sounds good to me - not sure if it would work in the template as I think this is the same for identifiers and general properties, but definitely makes sense as a recommendation. Andrew Gray (talk) 21:46, 10 May 2022 (UTC)
- Support of course. --Epìdosis 22:32, 19 May 2022 (UTC) P.S. @Ainali: Not 100% sure if loading these on instance of (P31) (as we currently do) is the best possible solution; some IDs (e.g. Property:P268) already have more than 5 instances even without counting these ... maybe we should think of a specific property? --Epìdosis 22:35, 19 May 2022 (UTC)
@Adam_Harangozó: thanks for the very useful page documenting external-ids!!!
You write in the intro
- Wikidata External Identifier properties should have dedicated items to represent their values and they should link to those using class of property value (P10726).
- But I think the prevailing current practice is to have an item that represents the database and link from it using "wikidata property".
- This is promulgated by the prop proposal template, which calls this item "Represents"
- Could you comment on my claim above, or edit the intro to reflect this practice?
Thanks! Vladimir Alexiev (talk) 16:01, 9 October 2023 (UTC)