S03 - Biodiversity Heritage Library: Strategies for Improving Research Efficiency and Delivering Biodiversity Data through Digital Library Collections

Session Type: Symposium
Full Title: S03 - Biodiversity Heritage Library: Strategies for Improving Research Efficiency and Delivering Biodiversity Data through Digital Library Collections
Short Title: BHL Strategies for Efficiency and Delivery
Organizer(s): Carolyn Sheffield, Biodiversity Heritage Library, Smithsonian, Washington DC
Contributors: Constance Rinaldo
  Grace Costantino
  Siobhan Leachman

Unsolicited contributions considered? No


Biodiversity literature and archival collections serve as lynchpin data sources for understanding the vast specimen collections of natural history museums and botanical gardens: they document specimen location data and the context in which specimens were collected, and they serve as a treasured source of the knowledge already gained from studying them. As an international consortium, the Biodiversity Heritage Library (BHL) participates in the larger biodiversity community as both a provider of rich, open access collections and a consumer of biodiversity data. BHL contains over 53 million pages of biodiversity literature and archives, including over 180 million instances of taxonomic names, along with species descriptions, traits, and other related data.

In addition to published literature, BHL has recently embarked on several initiatives to digitize field notes. Information found in biodiversity literature and field notes is crucial not just for identifying new species but also for identifying which species to protect in conservation efforts, focusing those efforts based on historic distribution records, and learning from past extinctions. In fact, such literature and archival collections sometimes contain the only record of specimens that have been destroyed.

As a consumer of biodiversity data, BHL uses tools and services powered by the Global Names Architecture (GNA) to make the taxonomic names in the pages of the literature searchable. These indexed names are then interlinked with the Encyclopedia of Life (EOL) and the Global Biodiversity Information Facility (GBIF), connecting literature references with curated species profiles. BHL also consumes data contributed by countless volunteers who transcribe field notes, correct optical character recognition (OCR) output, and assign taxonomic tags.

Based on user studies conducted in 2017, BHL has set two clear priorities for technical development in 2018. The first is to launch full-text search, which will better enable extraction of specimen information as well as geographic locations and common names. The second is to enhance the extractability of unique data in field notes by incorporating crowdsourced transcriptions of handwritten notes to better enable indexing and searching. As BHL enhances its available tools, specimen, trait, name, and location data will become more accessible and easier to connect.

This session will cover BHL’s recent initiatives to digitize field notes; contributions of volunteers to link specimen records to literature and incorporate BHL data and collections into external systems; strategies to improve discoverability of field notes and other materials through crowdsourcing initiatives; and current goals, challenges, and approaches for our technical development priorities for improving research methodology.