Wikidata:WikiProject Performing arts/Data sources

 


 

Data Sources - Performing Arts Productions

The table below gives an overview of the data sources from which data about performing arts productions can be imported (listed by country). Try to be as comprehensive as possible when listing potential data sources for a given country.

| Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BE | Flanders Arts Institute | Database of the Flanders Arts Institute - Productions | Performing arts productions (co-)produced by Flemish organisations since 1993 | | | CC-0 | User:Beireke1 | Data about people, organisations, premiere date and location first need to be uploaded to Wikidata as they are part of the metadata of productions. | Ingest complete since December 2018 |
| CH | Bern University of the Arts | Database of the Ehrenreich Collection | Inventory of the Ehrenreich Collection, comprising the metadata of several thousand opera performances and concerts | | Mapping File (Ehrenreich); initial draft; needs revising! | CC-0 | User:Beat Estermann | | In the process of data cleansing (spring 2018) |
| CH | Swiss Theatre Collection | Swiss Theatre Metadata - Professional Theatre Productions | Inventory of approx. 55'000 professional theatre productions in Switzerland, mostly from the second half of the 20th century, for some theatres reaching further back in time | Inventory of Professional Theatre Productions in Switzerland (Q50920196) | Mapping File (Switzerland) | CC-0 | User:Beat Estermann | Data about works and contributors have not been linked to authority files yet (designations and names are just stored as strings). | In the process of data cleansing (2017-2018); small pilot ingest (2018) |
| CH | Swiss Theatre Collection | to be published | Inventory of amateur theatre productions in Switzerland, from the second half of the 20th century. | | | | User:Beat Estermann | Data about works and contributors have not been linked to authority files yet (designations and names are just stored as strings). | Ingest postponed |
| CH | Zurich Municipal Archive | Repertoire des Schauspielhauses Zürich, 1938-1968 | Inventory of theatre productions at Schauspielhaus Zürich between 1938 and 1968 | Repertoire of Schauspielhaus Zurich, 1938-1968 (Q39907533) | Mapping File (Switzerland) | CC-0 | opendata@zuerich.ch | Unlike the datasets from the Swiss Theatre Collection, this dataset contains information about which actor played which role. | Ingest complete (query) |

Data Sources - Performance History Databases

The table below gives an overview of performance history databases (listed by country). Unlike the databases in the table above, these contain data about individual performances (as opposed to productions consisting of several performances). Try to be as comprehensive as possible when listing potential data sources for a given country.

| Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CH | Montreux Jazz Archives | Montreux Jazz Festival Database (Q99181182) | The dataset covers all the music performances in the context of the Montreux Jazz Festival. | | | | | | 4628 items ingested[1] |
| US | Carnegie Hall | Carnegie Hall Performance History | The published data set covers nearly 45,000 performance events (music, dance, theatre), civic meetings, debates, lectures, film screenings, as well as corresponding records for more than 100,000 artists, 20,000 composers and over 85,000 musical works. | | Mapping File (Carnegie Hall) | CC-0 | Rob Hudson, Manager Archives, rhudson@carnegiehall.org | | Ingest originally planned for spring 2018, postponed |
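
The number of ingested items cited in footnote [1] comes from a SPARQL query (reproduced in the References section). As a minimal sketch, assuming the public Wikidata Query Service endpoint and the standard SPARQL JSON results format, the query could be run and its results listed from Python as follows. The function names and the User-Agent string are illustrative assumptions, not part of the project's tooling.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public Wikidata Query Service endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

# The query from the References section, unchanged.
QUERY = """SELECT ?item ?itemLabel WHERE {
  ?item wdt:P361 wd:Q669118.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}"""


def build_url(query):
    """Return the GET URL that asks the endpoint for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def extract_qids(result):
    """Pull the Q-ids out of a SPARQL JSON result that binds ?item."""
    bindings = result["results"]["bindings"]
    return [b["item"]["value"].rsplit("/", 1)[-1] for b in bindings]


def fetch(query):
    """Run the query over the network (the User-Agent value is a placeholder)."""
    request = Request(
        build_url(query),
        headers={"User-Agent": "performing-arts-example/0.1 (placeholder contact)"},
    )
    with urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `len(extract_qids(fetch(QUERY)))` would give the current number of matching items; the figure will change as items are added or removed on Wikidata.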

Data Sources - Corporate Bodies Related to the Performing Arts

The table below gives an overview of the data sources from which data about corporate bodies related to the performing arts (e.g. performing arts companies, theatre organizations, etc.) can be imported (listed by country). Try to be as comprehensive as possible when listing potential data sources for a given country.

| Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BE | Flanders Arts Institute | Flanders Arts Institute - Organisations | Organisations (co-)producing professional performing arts (co-)produced in Flanders since 1993. | | | CC-0 | User:Beireke1 | | Ingest complete since December 2018 |
| CH | Swiss Theatre Collection | Swiss Theatre Metadata - Professional Performing Arts Companies | Inventory of all known professional performing arts companies in Switzerland | | | CC-0 | User:Beat Estermann | Data about the members (performing arts professionals) have not been linked to authority files yet (names are just stored as strings). | Largely ingested; some errors need to be corrected (see case report) |
| CH | Swiss Theatre Collection | Swiss Theatre Metadata - Puppet Theatre Companies | Inventory of all known puppet theatre companies in Switzerland | | | CC-0 | User:Beat Estermann | Data about the members (performing arts professionals) have not been linked to authority files yet (names are just stored as strings). | |
| CH | Swiss Theatre Collection | Swiss Theatre Metadata - Amateur Theatre Companies | Inventory of amateur theatre companies in Switzerland | | | CC-0 | User:Beat Estermann | | |
| CH | Swiss Theatre Collection | Swiss Theatre Metadata - Circuses | Inventory of circuses in Switzerland | | | CC-0 | User:Beat Estermann | | |

Data Sources - Character Roles

The table below gives an overview of the data sources from which data about character roles related to the performing arts can be imported (listed by country). Try to be as comprehensive as possible when listing potential data sources for a given country.

Tool: Role Extractor Script – used to extract data about roles from Wikipedia articles about plays (operas, musicals, theatre plays, etc.), to match the data against existing data in Wikidata, and to produce QuickStatements code to add new items to Wikidata and to complement existing ones.
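
The match-then-generate step described above can be illustrated with a short sketch. This is not the Role Extractor Script itself, only a hedged illustration of the idea: roles extracted from an article are compared with roles already on Wikidata, and QuickStatements (V1, tab-separated) lines are produced for the missing ones. The function names, the input structures and the class Q-id passed as `role_class_qid` are assumptions; P31 (instance of), P1441 (present in work) and P412 (voice type) are existing Wikidata properties, but whether the real script uses them in this way is not confirmed here.

```python
def normalize(label):
    """Lower-case and collapse whitespace so that trivial variants match."""
    return " ".join(label.lower().split())


def roles_to_quickstatements(work_qid, extracted_roles, existing_roles, role_class_qid):
    """Return QuickStatements V1 lines for the extracted roles.

    extracted_roles: list of dicts with 'name' and optional 'voice_type_qid'
    existing_roles:  dict mapping labels of roles already on Wikidata to Q-ids
    role_class_qid:  Q-id of the class used for P31 (supplied by the caller)
    """
    known = {normalize(label): qid for label, qid in existing_roles.items()}
    lines = []
    for role in extracted_roles:
        name = role["name"]
        qid = known.get(normalize(name))
        if qid is None:
            # New role: create an item with a label, a class and a link to the work.
            lines.append("CREATE")
            lines.append(f'LAST\tLen\t"{name}"')
            lines.append(f"LAST\tP31\t{role_class_qid}")
            lines.append(f"LAST\tP1441\t{work_qid}")
            subject = "LAST"
        else:
            # Existing role: only complement it with further statements.
            subject = qid
        voice_type = role.get("voice_type_qid")
        if voice_type:
            lines.append(f"{subject}\tP412\t{voice_type}")
    return lines
```

Matching on normalized labels alone is deliberately naive; in practice the matches would need manual review before the generated statements are submitted.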

| Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| n/a | Wikipedia | Data compiled from templates and tables contained in WP articles about operatic works | character roles with indication of voice type and premiere cast for 1000-2000 operatic works | n/a | Mapping file (Character roles) | CC-0 | User:Beat Estermann | | Ingested in June 2018: data from WP-de (character roles and voice types). Ingested in May 2019 (by User:Arizcraf): data from WP-en (character roles and voice types); data from WP-fr (character roles and voice types) |
| n/a | Wikipedia | Data compiled from templates and tables contained in WP articles about theatrical plays. | character roles for 2000-3000 theatre plays (rough estimate; sometimes with the indication of the original cast). | n/a | same mapping as for opera characters (but without voice type) | CC-0 | User:Beat Estermann | | Ingested in May 2019 (by User:Arizcraf): data from WP-en (character roles) |
| n/a | Wikipedia | Data compiled from templates and tables contained in WP articles about musicals | character roles for musicals, sometimes with information regarding the premiere cast and the voice type | n/a | same mapping as for opera characters | CC-0 | User:Beat Estermann | | Currently no ingest scheduled. |

Data Sources - Works

The table below gives an overview of the data sources from which data about works related to the performing arts can be imported (listed by country). Try to be as comprehensive as possible when listing potential data sources for a given country.

| Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| US | Robert Glaubitz | aria-database.com | Contains data about 1288 arias from 177 operas by 65 composers. | | | | | The database has not been enhanced since 2013. It most likely does not contain much data that is not already available through the Opera Database (needs checking). | Unclear whether it is worth ingesting. |
| US | Lee Steiner | The Opera Database | Contains data about several thousand operas and almost 4000 arias. | The Opera Database (Q54366466) | Mapping file (Operas and arias) | no copyright (data) | User:Beat Estermann | | Ingested in June 2018: operas (without the links to IMSLP, synopsis and libretto); arias (without the links to operas not already pre-existing on Wikidata and without the links to PDFs or reprints) |
| US | Stanford University | OperaGlass | Contains an index of over 4800 opera composers with opera lists (more than 25'000 operas) | | | no copyright (data) | | Varying data quality and structure. | |
| US | Opera America Inc | North American Works Directory | Contains data about more than 1200 North American opera and music-theater works. | | | no copyright (data) | | | |
| UK | Boosey & Hawkes | Boosey & Hawkes Opera Index | Contains data about approx. 500 operas (focus on 20th and 21st century) | | | | | | |
| US/CA | Project Petrucci LLC | IMSLP | Contains data about musical works. | | | | | | |
| CA | LiederNet Archive | lieder.net | Contains data about art songs (lieder, chansons, etc.) by more than 16'000 composers and more than 15'000 text authors. | | | no copyright (data); some of the song texts are under copyright | User:Beat Estermann | | Ingest is being prepared (June 2020) |

Describe the Data Sources on Wikidata

Databases suitable for data ingestion into Wikidata should be described on Wikidata itself. For each database, an item needs to be created on Wikidata, so that the database can be cited as a source when ingesting data into Wikidata. Refer to Help:Sources for information about how to use sources in Wikidata. When it comes to importing data about heritage institutions, the types of data sources most commonly found are:

  • Online databases: In the case of an online database, a Wikidata property should be created that corresponds to a unique identifier used to refer to items in the database. For the database itself, a Wikidata item should be created. Refer to Help:Sources#Databases for further guidance. Example: <insert an example of an online database here.> Note: In fall 2016, it was impossible to properly reference online databases using the Quick Statements Tool, a tool commonly used to batch ingest data into Wikidata. As of December 2016, a new version of the tool is under development. <Check the next release of the tool and update this note if necessary.>
  • Database dumps / database exports: In the case of a database dump or a database export, the source file is typically available in a spreadsheet format (e.g. CSV or Excel) or in a hierarchical format (e.g. XML). In this case, a Wikidata item should be created for the database itself (example: Swiss GLAM Inventory (Q26933296)) and for the specific export file (example: Swiss GLAM Inventory, 16 September 2016 (Q27477970)).
  • Simple web pages: In some cases you may find relevant lists on simple web pages. If the source is a simple website, it does not need to be described as a separate Wikidata item. Refer to Help:Sources#Web page for further information about how to source statements to simple web pages.
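
To illustrate how a database export described on Wikidata can be cited during a batch ingest, here is a minimal sketch that builds one QuickStatements line (V1 syntax, tab-separated) with a source. It assumes that sources are given as S-prefixed properties, with S248 for "stated in" (P248) and S813 for "retrieved" (P813). The function name and the example statement are illustrative assumptions; Q4115189 is the Wikidata Sandbox item, and Q27477970 is the export file mentioned above.

```python
from datetime import date


def sourced_statement(item_qid, prop, value, export_qid, retrieved):
    """Build one QuickStatements V1 line that cites a database export.

    The statement is sourced with 'stated in' (S248) pointing to the item
    describing the export file, and 'retrieved' (S813) with day precision (/11).
    """
    retrieved_value = f"+{retrieved.isoformat()}T00:00:00Z/11"
    return "\t".join(
        [item_qid, prop, value, "S248", export_qid, "S813", retrieved_value]
    )


# Example: a 'country' (P17) statement on the sandbox item, sourced to the
# export file Swiss GLAM Inventory, 16 September 2016 (Q27477970).
line = sourced_statement("Q4115189", "P17", "Q39", "Q27477970", date(2016, 9, 16))
print(line)
```

Citing the item of the specific export file, rather than the database in general, makes it possible to tell later which version of the source a statement was taken from.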

Unique Identifiers

Before ingesting data into Wikidata, we usually want to make sure that our source database contains a unique identifier which can later be used to match the data on Wikidata with the data in the source database. This is particularly useful in the case of future updates to the source database. There are two commonly used approaches to ingesting such unique identifiers into Wikidata:

With regard to the data sources concerning the performing arts, it is recommended to use single-source identifiers for well-established databases. In this case, you will have to:

  1. Create a Wikidata entity for the identifier (if it does not already exist).
  2. Propose the creation of a corresponding property. Note: creating new properties requires community approval, which may take several weeks.
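
Once an identifier property exists, it can be used to match source records against Wikidata, for example when the source database is updated. The following is a minimal sketch under stated assumptions: `P9999` is a placeholder for the identifier property (not a real property for any of the databases listed here), the source records are dictionaries with an `id` field, and the mapping from identifier to Q-id has already been retrieved from Wikidata, for instance with the query returned by `identifier_query`.

```python
def identifier_query(property_id):
    """Return a SPARQL query listing all items that carry the given identifier."""
    return f"SELECT ?item ?id WHERE {{ ?item wdt:{property_id} ?id . }}"


def match_by_identifier(source_records, wikidata_ids, id_field="id"):
    """Split source records into those already on Wikidata and those still missing.

    wikidata_ids maps identifier values (as strings) to Q-ids.
    Returns (to_update, to_create): a list of (qid, record) pairs and a list
    of records without a matching item.
    """
    to_update, to_create = [], []
    for record in source_records:
        qid = wikidata_ids.get(str(record[id_field]))
        if qid:
            to_update.append((qid, record))
        else:
            to_create.append(record)
    return to_update, to_create


# Usage with made-up data: record 17 already exists on Wikidata, record 42 does not.
records = [{"id": 17, "title": "Production A"}, {"id": 42, "title": "Production B"}]
to_update, to_create = match_by_identifier(records, {"17": "Q100"})
```

Matching on the identifier instead of on names avoids duplicates when labels differ between the source and Wikidata.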

Mapping Between the Data Structure in the Source Files and the Data Structure on Wikidata

Another important step in preparing the ingestion of data from the source databases into Wikidata is the mapping between the data structure in the source files and the data structure on Wikidata. For this purpose, you should create a sub-page of this page that is specific to your country (unless one already exists). Example: Mapping information for data sources covering Swiss heritage institutions.

The data needs to be mapped at two levels:

  • Properties: For each property contained in the source data file, a corresponding property needs to be identified (or newly created) on Wikidata. If the source data is in table format, the column headers usually represent the properties, while each row typically represents one item. A list of properties commonly used in relation to the performing arts can be found in the Data Structure section.
  • Classes / controlled vocabularies: In some cases, the values of the properties may be simple strings, as for example in a physical address ("Museum Street 1"); in this case, no further mapping is needed. In other cases, the values are controlled vocabularies represented on Wikidata by specific classes or entities. In this case, the values in the source data file need to be mapped to those specific classes or entities. See the Mapping Information for the Swiss GLAM Inventory for a series of examples. An overview of controlled vocabularies commonly used in relation to the performing arts can be found in the Data Structure section; an overview of performing arts related typologies currently in use on Wikidata can be found in the Typology section. Depending on the dataset, it may be necessary to complement these controlled vocabularies by creating new items on Wikidata before the ingestion of the data can begin.
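
The two mapping levels can be sketched in code as follows. This is an illustration only: the column names, the contents of both mapping tables and the class identifier `Q_PUPPET_THEATRE_COMPANY` are hypothetical placeholders, not the project's actual mapping. P17 (country), P31 (instance of) and Q39 (Switzerland) are existing Wikidata identifiers; `Len` stands for the English label in QuickStatements notation.

```python
# Level 1: source column -> Wikidata property (hypothetical column names).
PROPERTY_MAP = {
    "name": "Len",           # English label
    "country": "P17",        # country
    "company_type": "P31",   # instance of
}

# Level 2: controlled vocabulary values -> Wikidata items.
VALUE_MAP = {
    "country": {"ch": "Q39"},
    # Placeholder, to be replaced by the Q-id of the matching class on Wikidata.
    "company_type": {"puppet theatre": "Q_PUPPET_THEATRE_COMPANY"},
}


def map_row(row):
    """Map one source row to (property, value) pairs.

    Returns the mapped statements plus the vocabulary values for which no
    Wikidata item is known yet, so they can be created before the ingest.
    """
    statements, unmapped = [], []
    for column, raw in row.items():
        prop = PROPERTY_MAP.get(column)
        if prop is None or not raw:
            continue  # column not mapped, or empty cell
        if column in VALUE_MAP:
            target = VALUE_MAP[column].get(raw.strip().lower())
            if target is None:
                unmapped.append((column, raw))
            else:
                statements.append((prop, target))
        else:
            statements.append((prop, f'"{raw}"'))  # plain string value
    return statements, unmapped


row = {"name": "Example Company", "country": "CH", "company_type": "Circus", "notes": "n/a"}
statements, unmapped = map_row(row)
```

In this example, "Circus" ends up in `unmapped` because the vocabulary table has no entry for it, which signals that the controlled vocabulary needs to be complemented before the row can be fully ingested.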

References

  1. SPARQL query: SELECT ?item ?itemLabel WHERE { ?item wdt:P361 wd:Q669118. SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } }