Wikidata:WikiProject University of Virginia/SNAC

SNAC's own website displays links to Wikipedia articles and images from Wikimedia Commons, as shown here for Yellow Wolf (Q8051857).

SNAC @ University of Virginia is a project to present SNAC (Q29861311), Social Networks and Archival Context, in Wikidata through the Wikidata property SNAC ARK ID (P3430).

About

SNAC is an online project, started by a collaboration of United States-based organizations, for discovering, locating, and using distributed historical records.

SNAC organizing institutions include National Archives and Records Administration (Q518155), California Digital Library (Q5020447), Institute for Advanced Technology in the Humanities (Q13515691), and University of California, Berkeley School of Information (Q14684220). Many other organizations participate by developing the SNAC database.

History of SNAC and Wikidata

On 12 December 2016 user:YULdigitalpreservation, a self-described Yale University librarian, proposed that Wikidata import the identifiers of the SNAC database. The Wikidata community applied its general norms at Wikidata:Property proposal to SNAC in a discussion at Wikidata:Property_proposal/Social_networks_and_archival_context_ID. The social context is that the Wikidata community collectively decides which properties to use to describe Wikidata items. Proposals for library identifiers and authority control frequently pass once someone establishes that a reputable institution hosts the identifier, that the identifier seems stable, and that someone has a plan to match it to Wikidata items. The discussion closed with a pass on 4 January 2017 when user:ArthurPSmith, a Wikidata community member trusted with a "property creator" designation, created SNAC ARK ID (P3430). On 10 January 2017 user:Reinheitsgebot, a bot operated by Wikidata and Wikimedia mainstay User:Magnus Manske, began to mass-add SNAC IDs to Wikidata items; the bot's log shows 2.5 hours of activity for that day. The addition of SNAC identifiers to Wikidata items means that anyone can query Wikidata using SNAC identifiers, and anyone can travel to the SNAC database starting from Wikidata items.
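As a rough illustration of what querying Wikidata by SNAC identifier can look like, the following Python sketch builds a SPARQL query that looks up items by their SNAC ARK ID (P3430). The function name `build_snac_lookup_query` and the sample identifier `w6example` are hypothetical and used only for illustration; the sketch only constructs the query text and does not contact any server.

```python
def build_snac_lookup_query(snac_ark_id: str) -> str:
    """Return a SPARQL query for Wikidata items whose SNAC ARK ID (P3430) equals snac_ark_id."""
    # Escape backslashes and double quotes so the value is a safe SPARQL string literal.
    escaped = snac_ark_id.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f'  ?item wdt:P3430 "{escaped}" .\n'
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


if __name__ == "__main__":
    # "w6example" is a made-up identifier, not a real SNAC record.
    print(build_snac_lookup_query("w6example"))
```

The resulting text could be pasted into the Wikidata Query Service, which would return the matching item and its English label if an item carries that identifier.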

 
Timeline visualization of change logs for SNAC ARK ID. Note that most integration happened in the first month. Thereafter Wikidata updated single items as SNAC added new items.

Because Wikidata logs changes, it is possible to determine how the Wikidata community imported SNAC IDs by looking at the timeline visualization of change logs. The timeline shows that Wikidata users did the first experimental SNAC ID imports in January 2017. After these cautious trials, the community imported a total of 10,000 SNAC IDs and then paused. After the community considered the result, an additional upload period brought the total to 50,000 SNAC IDs. After another pause there was an upload to 95,000 SNAC IDs, then a pause before a jump to 125,000 SNAC IDs. Since the end of January the trend has been a series of individual SNAC ID additions, bringing the total to almost 130,000. This data ingestion pattern is common in Wikidata: users test a small data set to upload, then a more substantial set, then jump through larger sets to get most of the data. At the end there are often odd, challenging cases which require human attention on individual items.

In May 2017 Wikidata community member user:Fuzheado created a Wikidata item for SNAC (Q29861311). This Wikidata item is for SNAC itself as a database, whereas the property which the Wikimedia community created by consensus is a characteristic of items in Wikidata. The creation of the item is a way for the Wikidata community to describe the database itself with structured data and to document the significance of the property for the Wikidata community.

By October 2017 SNAC had changed its identifiers. Previously SNAC IDs were a mix of letters and numbers. After the change the identifiers were all numbers. This change meant that the connection between Wikidata and SNAC broke. In such cases the Wikidata community logs a discussion on the talk page of the property. The Formatter URL & ID type discussion notes who discussed and fixed the issue, including Wikidata community members user:Fuzheado and user:Pigsonthewing with SNAC team User:DinaHerbert and User:Deternitydx.
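To make the breakage concrete, here is a minimal Python sketch of how a formatter URL turns a stored identifier into a link, and how an identifier can be checked against the all-digits format described above. The formatter URL `https://example.org/snac/$1` is a placeholder and not the real value for SNAC ARK ID (P3430), the sample identifiers are made up, and the helper names are hypothetical. The only convention assumed from Wikidata is that `$1` marks where the identifier is substituted.

```python
import re

# Placeholder formatter URL (an assumption); the real one is recorded on the P3430 property page.
FORMATTER_URL = "https://example.org/snac/$1"

# Assumption taken from the text: after the change, identifiers consist of digits only.
NUMERIC_ID = re.compile(r"[0-9]+")


def expand_formatter_url(formatter_url: str, identifier: str) -> str:
    """Substitute the identifier for the $1 placeholder in a formatter URL."""
    return formatter_url.replace("$1", identifier)


def is_new_style_id(identifier: str) -> bool:
    """Return True if the whole identifier is made of digits."""
    return NUMERIC_ID.fullmatch(identifier) is not None


if __name__ == "__main__":
    # Both sample identifiers are invented for illustration.
    for sample in ["w6abc123", "12345678"]:
        print(sample, is_new_style_id(sample), expand_formatter_url(FORMATTER_URL, sample))
```

A check like this can flag stored values that no longer match the expected format, which is one way the kind of mismatch discussed on the property talk page might be detected.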

In June 2018 the SNAC team began to reflect on the relationship between SNAC and Wikidata. Discussions focused on Wikidata included questions about who in Wikidata had imported SNAC identifiers, why the Wikidata community became interested, and what the Wikidata community was doing with SNAC. Discussions focused on SNAC included describing the extent to which anyone developed SNAC using Wikidata, the extent to which Wikidata increases accessibility of SNAC resources, and what relationship the SNAC team should have with Wikidata.

Project results

Changes to SNAC

There are two obvious changes to the SNAC database. First, entries in SNAC link to the English Wikipedia article when one exists for a topic. Second, when Wikidata has identified a Wikimedia Commons image for a topic, SNAC displays that image in its entry.

Changes to Wikidata

The Wikidata community routinely collects database identifiers as a way of directing its users to more comprehensive library collections.

  • For a single example of how SNAC identifiers appear in Wikidata, consider Yellow Wolf (Q8051857). This is a Wikidata item for a person. The SNAC ARK ID property is among the other properties for this person. The SNAC ID in Wikidata makes the connection to this person's profile on the SNAC website. This is an example of the basis of the relationship between SNAC and Wikidata: Wikidata matches its Q codes with the SNAC identifier.
  • The timeline visualization of change logs shows that in January 2017 the Wikidata community imported 125,000 SNAC IDs.
  • SNAC IDs are especially influential for Wikidata items which have a SNAC ID but few other Wikidata properties.
  • In some cases the SNAC ID is the only identifier for that Wikidata topic.
  • The Wikidata community has presented its collaboration with SNAC as a case study.

Thanks

team SNAC
team wiki