Wikidata:WikiProject Heritage institutions/Data sources
Sources to Import Data From
The table below gives an overview of the data sources from which data about heritage institutions can be imported, listed by country. When listing potential data sources for a given country, try to be as comprehensive as possible.
For a more comprehensive overview, see the list of datasets that have been identified in the context of the FindingGLAMs project.
Note: These overview tables should eventually be merged.
Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
---|---|---|---|---|---|---|---|---|---|
CH | OpenGLAM CH | Swiss GLAM Inventory | All heritage institutions in Switzerland (with the exception of smaller municipal archives) | Swiss GLAM Inventory (Q26933296), Swiss GLAM Inventory, 16 September 2016 (Q27477970) | CH | CC-0 | User:Beat Estermann | Database compiled from various data sources. | Selected data fields for all items have been ingested (sample query). |
CH | Swiss Museums Association | Swiss Museums Database | All presently existing museums in Switzerland fulfilling a minimum requirement regarding accessibility to the public | to be clarified | David Vuillaume (Swiss Museums Association) | Data partly included in the Swiss GLAM Inventory. The Swiss Museums Association is ready to give access to its API. | |||
CH | Swiss National Library | Swiss ISIL Inventory | All the libraries and partly also archives (and museums?) in Switzerland | to be clarified | Data partly included in the Swiss GLAM Inventory. Scheduled to be published as linked open data in 2017. | ||||
CH | infoclio.ch | Inventory of Swiss Archives | Selection of archives in Switzerland, including a description of their holdings | to be clarified | Enrico Natale (infoclio.ch) | Data partly included in the Swiss GLAM Inventory. | |||
CH | arCHeco | Register of economic fonds preserved in archives of Switzerland and Liechtenstein | Archives in Switzerland with economic fonds | to be clarified | Daniel Nerlich (Archiv für Zeitgeschichte) | Data partly included in the Swiss GLAM Inventory. | |||
CH | Federal Office of Civil Protection | Swiss Inventory of Cultural Property of National and Regional Significance (Q869941) | Contains not only information about historical monuments of regional or national significance in Switzerland, but also heritage collections of regional or national significance (and the corresponding institutions). | Swiss Inventory of Cultural Property of National and Regional Significance (Q869941) | CC-0 | Data partly included in the Swiss GLAM Inventory. | Scheduled to be ingested in early 2017. | ||
CH | Federal Office of Culture | List of monuments protection services and archaeological services | Cantonal and municipal monuments protection services and archaeological services in Switzerland | none | Simple web page. Data partly included in the Swiss GLAM Inventory. | ||||
CH | Swissbib (University of Basel) | Inventory of Research Libraries in Switzerland | All research libraries in Switzerland | CC-0 | Not limited to heritage institutions; the heritage institutions should all be included in the Swiss GLAM Inventory. Might be available as LOD in the meantime. | ||||
US | IMLS | List of museums | Museums listing based on a 2015 survey | ? | ids, with complex visualizations attached related to demographics | ||||
International | ISIL | Various, listed at http://biblstandard.dk/isil/ | International Standard Identifier for Libraries and Related Organizations, which also covers institutions other than libraries | ISIL (P791) | |||||
MX | xx | Directorio - Bibliotecas de la DGB de la Red Nacional de Bibliotecas, de la Secretaria de Cultura en formato CSV | Public Libraries in Mexico | CC-BY |
Describe the Data Sources on Wikidata
Databases suitable for data ingestion into Wikidata should be described on Wikidata itself. For each database, an item needs to be created on Wikidata, so that the database can be cited as a source when ingesting data into Wikidata. Refer to Help:Sources for information about how to use sources in Wikidata. When it comes to importing data about heritage institutions, the types of data sources most commonly found are:
- Online databases: In the case of an online database, a Wikidata property should be created that corresponds to the unique identifier used to refer to items in the database. For the database itself, a Wikidata item should be created. Refer to Help:Sources#Databases for further guidance. Example: <insert an example of an online database here.>. Note: In fall 2016, it was impossible to properly reference online databases using the QuickStatements tool, which is commonly used to batch ingest data into Wikidata. As of December 2016, a new version of the tool is under development. <Check the next release of the tool and update this note if necessary.>
- Database dumps / database exports: In the case of a database dump or a database export, the source file is typically available in a spreadsheet format (e.g. CSV or Excel) or in a hierarchical format (e.g. XML). In this case, a Wikidata item should be created for the database itself (example: Swiss GLAM Inventory (Q26933296)) and for the specific export file (example: Swiss GLAM Inventory, 16 September 2016 (Q27477970)).
- Simple web pages: In some cases you may find lists of heritage institutions on simple web pages. If the source is a simple website, it does not need to be described as a separate Wikidata item. Refer to Help:Sources#Web page for further information about how to source statements to simple web pages.
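As a rough illustration of how a database export can be cited once it has its own item, the sketch below builds one tab-separated command in the style of QuickStatements (version 1), where a reference is given with the S-prefixed form of "stated in" (P248). The helper name `quickstatements_line`, the use of the Wikidata sandbox item (Q4115189) as the subject, and the ISIL value are assumptions made for the example; only Q27477970 and P791 come from this page. Check the current documentation of the tool before relying on this syntax.

```python
def quickstatements_line(item_qid, prop, value, stated_in_qid):
    """Build one tab-separated QuickStatements (version 1) style command.

    The string value is wrapped in double quotes; the reference is
    expressed as S248 ("stated in") pointing at the source item.
    """
    return "\t".join([item_qid, prop, f'"{value}"', "S248", stated_in_qid])


# Q4115189 (sandbox item) and the ISIL string are placeholders for this
# sketch; Q27477970 is the export-file item mentioned in the list above.
line = quickstatements_line("Q4115189", "P791", "CH-000000-0", "Q27477970")
print(line)
```

Citing the item of the specific export file, rather than the item of the database in general, keeps track of which version of the data a statement was taken from.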
Unique Identifiers
Before ingesting data into Wikidata, we usually want to make sure that our source database contains a unique identifier which can later be used to match the data on Wikidata with the data in the source database. This is particularly useful in the case of future updates to the source database. There are two commonly used approaches to ingesting such unique identifiers into Wikidata:
- The use of a single-source identifier: In this case, a particular Wikidata property is created that corresponds to a unique identifier used in a given database. Example: Elvis Presley (Q303) has a Library of Congress authority ID (P244) : "n78079487". Typically, identifier properties have their corresponding Wikidata entity, in the case of our example: Library of Congress Authorities (Q13219454).
- The use of a multi-source identifier: In this case, a particular Wikidata property and a Wikidata item are used that are generic for a particular domain; multi-source identifiers should be unique in combination with a qualifier. Example: The item Judith Holding the Head of Holofernes (Q17319619) has an inventory number (P217) : "SK-A-1" that is further qualified by collection (P195) : Rijksmuseum (Q190804). In this case, the collection and the inventory number combined form a unique identifier (the inventory number on its own does not have to be unique across collections). The corresponding Wikidata entity in this example is: accession number (Q1417099).
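The difference between the two approaches can be sketched as a lookup key: a single-source identifier is unique as a (property, value) pair, while a multi-source identifier only becomes unique once the qualifier is added to the key. The names `Identifier`, `match_key`, `index` and `find_item` are hypothetical and exist only for this illustration; the property and item IDs are the ones used in the two examples above.

```python
from typing import NamedTuple, Optional


class Identifier(NamedTuple):
    prop: str                        # Wikidata property, e.g. "P244"
    value: str                       # identifier string from the source
    qualifier: Optional[str] = None  # qualifying item, multi-source only


def match_key(ident: Identifier) -> tuple:
    """Return the tuple that has to be unique for this identifier."""
    if ident.qualifier is None:
        return (ident.prop, ident.value)               # single-source
    return (ident.prop, ident.value, ident.qualifier)  # multi-source


# A tiny in-memory index from match key to Wikidata item.
index = {
    match_key(Identifier("P244", "n78079487")): "Q303",
    match_key(Identifier("P217", "SK-A-1", "Q190804")): "Q17319619",
}


def find_item(ident: Identifier) -> Optional[str]:
    """Look up the item matching an identifier, or None if unknown."""
    return index.get(match_key(ident))


print(find_item(Identifier("P244", "n78079487")))
print(find_item(Identifier("P217", "SK-A-1")))  # no qualifier, so no match
```

In this sketch an inventory number without its collection does not resolve to any item, which mirrors the point that the inventory number on its own is not required to be unique.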
With regard to the data sources concerning heritage institutions, it is recommended to use single-source identifiers for well-established databases. In this case, you will have to:
- Create a Wikidata entity for the identifier (if it does not already exist).
- Propose the creation of a corresponding property. Note: creating new properties requires community approval that may take several weeks.
If you have compiled an inventory of heritage institutions in your country by drawing on various sources, you may use the GLAM ID (Q25839974) and the corresponding property GLAM ID (P3066) to add unique identifiers to Wikidata. Before doing so, you should coordinate with the OpenGLAM Benchmark Survey Project that has started using the GLAM ID as unique identifier; it is important that these identifiers remain unique.
Mapping Between the Data Structure in the Source Files and the Data Structure on Wikidata
Another important step in preparing the ingestion of data from the source databases into Wikidata is the mapping between the data structure in the source files and the data structure on Wikidata. For this purpose, you should create a sub-page of this page that is specific to your country (unless one already exists). Example: Mapping information for data sources covering Swiss heritage institutions.
The data needs to be mapped at two levels:
- Properties: For each property contained in the source data file, a corresponding property needs to be identified (or newly created) on Wikidata. If the source data is in table format, the column headers usually represent the properties, while each row typically represents one item. A list of properties commonly used in relation to heritage institutions can be found in the Data Structure section.
- Classes / controlled vocabularies: In some cases, the values of the properties may be simple strings, as for example in a physical address ("Museum Street 1"); in this case, no further mapping is needed. In other cases, the values are controlled vocabularies represented on Wikidata by specific classes or entities. In this case, the values in the source data file need to be mapped to those specific classes or entities. See the Mapping Information for the Swiss GLAM Inventory for a series of examples. An overview of controlled vocabularies commonly used in relation to heritage institutions can be found in the Data Structure section; an overview of the typology of heritage institutions currently in use on Wikidata can be found in the Typology section. Depending on the dataset and the country, it may be necessary to complement these controlled vocabularies by creating new items on Wikidata before the ingestion of the data can begin.
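The two mapping levels can be sketched in a few lines of Python. The column names (`glam_id`, `isil`, `type`), the sample rows and the helper `map_row` are assumptions made for this illustration and do not reflect the layout of any actual source file; the real mapping for a dataset belongs on the country-specific sub-page. The properties used are GLAM ID (P3066), ISIL (P791) and instance of (P31), and the three type values are mapped to the items for museum (Q33506), library (Q7075) and archive (Q166118).

```python
import csv
import io

# Level 1: column headers in the source file -> Wikidata properties.
PROPERTY_MAP = {"glam_id": "P3066", "isil": "P791", "type": "P31"}

# Level 2: controlled vocabulary values -> Wikidata items.
TYPE_MAP = {"museum": "Q33506", "library": "Q7075", "archive": "Q166118"}


def map_row(row):
    """Turn one source row into a list of (property, value) pairs."""
    statements = []
    for column, prop in PROPERTY_MAP.items():
        raw = (row.get(column) or "").strip()
        if not raw:
            continue  # empty cell: nothing to ingest for this property
        if column == "type":
            qid = TYPE_MAP.get(raw.lower())
            if qid is None:
                # Unmapped value: a new item or mapping entry is needed
                # before ingestion can begin.
                raise ValueError(f"unmapped type value: {raw!r}")
            statements.append((prop, qid))
        else:
            statements.append((prop, raw))  # plain string, no mapping
    return statements


# Made-up sample data standing in for a CSV export.
SAMPLE = "glam_id,isil,type\nX-001,,Museum\nX-002,XX-0000002,Library\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
mapped = [map_row(r) for r in rows]
print(mapped)
```

Raising an error on an unmapped vocabulary value, instead of skipping it silently, makes gaps in the mapping visible before any data is written.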