Saturday, March 2, 2024

Data and Databases

I've recently taken over as the Zone 2 (British Columbia, Idaho, Oregon, Washington) Coordinator for the Lepidopterists' Society Season Summary. With that have come many questions from people as I compiled my first report and learned the ropes. This post summarizes the public databases that include, or are exclusively for, records of Lepidoptera.

Simply defined, a database is a collection of data that is organized to facilitate the search and retrieval of that data. Prior to the digital age, such a database could consist of a card catalogue, or even a well-organized series of notebooks. With the advance of digital technology, the number and usage of online databases have increased over the past twenty years. In particular, when analyzing large downloads of butterfly data, I've noticed that the number of modern records begins to increase sharply around 2013, which is roughly when iNaturalist and eButterfly began to catch on.
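
For anyone who wants to look for that kind of trend in their own download, here is a minimal Python sketch of counting records per year. It assumes a tab-delimited export with a column named "year"; the column name, the delimiter, and the sample rows are all assumptions for illustration, so adjust them to match whatever file you actually download.

```python
import csv
import io
from collections import Counter


def records_per_year(rows, year_field="year"):
    """Count records per year, skipping rows whose year is blank or non-numeric."""
    counts = Counter()
    for row in rows:
        value = (row.get(year_field) or "").strip()
        if value.isdigit():
            counts[int(value)] += 1
    # Return a plain dict ordered by year so the trend is easy to read.
    return dict(sorted(counts.items()))


# Made-up sample standing in for a real download (assumed tab-delimited).
SAMPLE = """species\tyear
Vanessa cardui\t2011
Vanessa cardui\t2014
Papilio rutulus\t2014
Papilio rutulus\t
"""

rows = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
yearly = records_per_year(rows)
for year, count in yearly.items():
    print(year, count)
```

To use it on a real file, open the download with `open(path, newline="", encoding="utf-8")` and pass that file object to `csv.DictReader` in place of the sample text.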

There are four sites to which users may submit Lepidoptera records online:

BAMONA began in 1995 as two sites, Butterflies of the U.S. and Moths of the U.S., which were merged and relaunched in 2006 as Butterflies and Moths of North America. Users can submit their photos for verification or identification. Records are shown as orange dots on a Google basemap, with historic records (usually center points of counties, indicating occurrence somewhere in the county) shown as purple dots. Currently it hosts over 953,000 verified records of butterflies and moths.

BugGuide was launched in 2003 and is hosted by Iowa State University. It is structured to assist in species identification through a clickable guide, by browsing photos by taxon, or by submitting photos as an "ID request". People may also submit any of their observations to the website. It currently hosts over 200,000 records of butterflies and moths, and many more of other Arthropods. Photos from verified submissions of moths on BugGuide are (with permission) sometimes copied to the Moth Photographers Group site.

iNaturalist began as a Master's project at UC Berkeley in 2008, eventually becoming an initiative of the California Academy of Sciences in 2014, and turned into an independent non-profit organization in 2023. It currently hosts over 2.5 million "research grade" records of butterflies and moths.

eButterfly also began as a graduate student's idea around 2010, forming in 2011 and initially launching in 2012 as a Canadian program, eventually expanding across North America. It is beginning to grow worldwide and currently hosts over 530,000 observations of butterflies.

Two databases compile data from most other online databases, in addition to many other sources, such as collections or smaller project-based sources. The Symbiota Collections of Arthropods Network (SCAN), as the name implies, is limited to Arthropod records only. It currently hosts over 33.7 million records for 238,000 species. The Global Biodiversity Information Facility (GBIF) contains most, if not all, of the same records fed into SCAN, but also contains data for all species, not just Arthropods. It currently hosts over 2.6 billion records. Many museum and university collections, and even some private ones, load their specimen data into these databases. The contributing collections are largely determined by which ones can obtain funding to support the digitization of their collections (it takes a lot of man-hours!).

BugGuide, iNaturalist, and eButterfly all load their "verified" or "research grade" records to SCAN and GBIF. BAMONA is not set up to do this, so any records submitted there will only be visible through that site, although researchers may submit data requests to the site.

Another way to submit public records is through The Lepidopterists' Society Season Summary ("SS"). This is an annual summary of butterfly and moth records in North America that has been in existence ever since the Society started in 1947. Any county or state records and other notable observations are printed in the annual summary that is sent to members, while all submitted data, printed or not, is curated and loaded into SCAN (some has also been loaded to BAMONA). It is up to you whether you want to submit a full report of all your observations for the year, or only a selection. The Zone Coordinator will select which records will be printed. Observations submitted to other databases (iNaturalist, BAMONA, etc.) can also be submitted to the SS, but to avoid an abundance of duplicate records in SCAN, the Coordinator will only submit a selection of these records. For example, if an observation on iNaturalist is determined to be a new state or county record, or is a species that hasn't been seen in a particular location for a long time, it can go into the SS to make it visible to a wider audience and document the special record. 

Some other items to note about submissions:

  • You do not have to be a member of LepSoc to submit records to the SS.
  • Any records submitted are not private: they will be loaded to SCAN, from which anyone can search, view, and download data.
  • There is a delay in uploading to SCAN: currently records up to about 2020 are loaded.
  • The Zone Coordinator can give you their preferred methods for submitting data to them. It is highly recommended to use an Excel spreadsheet, and a preformatted file can be downloaded from the LepSoc SS page (linked above). I have a slightly different spreadsheet format I like to use for sorting and curating data, and can email that to you upon request.
  • You do not have to submit any photos with your records, but the Coordinator may request to see a photo of a particular record to verify the correct identification.

If you wish to share your data publicly but aren't sure which site is right for you, ask yourself some questions:

  • Do you only photograph butterflies and often make multiple observations along a path or general area? eButterfly might be the site for you.
  • Do you wish to report moths (with or without butterflies)? iNaturalist or BAMONA is probably for you.
  • Do you enjoy the ability to interact with multiple experts and peers commenting on your observations, but don't mind if some of your posts don't get consistently verified? Try iNaturalist (eButterfly has started doing something similar as well).
  • Do you like to report species in addition to Lepidoptera? Try iNaturalist or BugGuide.
  • If you don't want to take the time to post individual observations with photos, and prefer to type up all your records in Excel, try the SS.
  • Do you want to submit to the SS but also like to post on one of the other sites? That's okay! Just recognize that the Coordinator will use only the records that are particularly notable, and you don't have to make that decision yourself. Simply put a link to each of your records into a Comments column when you type up your records to send to the Coordinator; that way they can easily verify your records and select which ones should go into the SS.
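
If you keep your records in a script rather than typing them by hand, the link-in-Comments idea from the last bullet might look something like the Python sketch below. The column names, the sample record, and the observation URL are all hypothetical placeholders; the preformatted LepSoc file and your Coordinator's preferred layout may differ, so treat this as an illustration only.

```python
import csv
import io

# Hypothetical column names; the actual LepSoc spreadsheet may use others.
COLUMNS = ["Species", "Date", "State/Province", "County", "Locality", "Comments"]


def with_link(record, url):
    """Return a copy of the record with the observation URL added to Comments."""
    row = dict(record)
    existing = row.get("Comments", "").strip()
    row["Comments"] = f"{existing}; {url}" if existing else url
    return row


def to_csv(records):
    """Write the records as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up record and placeholder URL, for illustration only.
record = {
    "Species": "Vanessa cardui",
    "Date": "2023-06-14",
    "State/Province": "WA",
    "County": "Example County",
    "Locality": "Example trail",
    "Comments": "2 adults nectaring",
}
row = with_link(record, "https://www.inaturalist.org/observations/000000")
csv_text = to_csv([row])
print(csv_text)
```

Keeping the link in the same cell as any existing note means the Coordinator can click through to the photo without needing a separate column.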

If you have any questions about the SS, please feel free to contact me; my email is in the left panel of this website.
