An overview of cloud computing in the petabyte era of seismology

Illustration of servers for computing and storage inside the outline of a cloud.

The EarthScope-operated data systems of the NSF GAGE and SAGE Facilities are migrating to cloud services. To learn more about this effort and find resources, visit earthscope.org/data/cloud

If you work with seismic data, put this recent paper on your reading list.

For those who have limited experience with seismic computing in the cloud, the review paper, published in Geophysical Journal International by a group of authors led by Yiyu Ni at the University of Washington, aims to provide a practical introduction to cloud tools and their pros and cons.

With specific examples of workflows from published projects, the paper covers cloud storage and database options, cloud computing resource options, and the value of tools such as Jupyter notebook hubs and easily shareable containers.

For example, while cloud computing contrasts clearly with the limitations of working locally on a personal computer, comparisons to institutional HPCs are much more nuanced.

An HPC can provide similar data processing throughput, they write, but “cloud infrastructure allows [researchers] to tune the requested resources precisely to their demand”. Cloud-native data archives also allow cloud workflows to skip data download and curation steps, but long-term storage of your output in the cloud can prove expensive.
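To get a feel for why long-term storage can add up, here is a minimal back-of-the-envelope sketch in Python. The output size and the per-gigabyte price are hypothetical placeholders chosen for illustration, not figures from the paper or from any provider's price list, and the function names are our own. Real bills also depend on storage tier, request counts, and data egress.

```python
def monthly_storage_cost(size_tb: float, price_per_gb_month: float) -> float:
    """Cost in USD of keeping `size_tb` terabytes in object storage for one month.

    Uses decimal units (1 TB = 1000 GB).
    """
    return size_tb * 1000 * price_per_gb_month


def cumulative_storage_cost(size_tb: float, price_per_gb_month: float, months: int) -> float:
    """Total cost in USD of keeping a fixed-size dataset stored for `months` months."""
    return monthly_storage_cost(size_tb, price_per_gb_month) * months


if __name__ == "__main__":
    # Hypothetical values: substitute your own output size and your provider's current rate.
    OUTPUT_SIZE_TB = 50.0        # assumed size of the processed output
    PRICE_PER_GB_MONTH = 0.02    # assumed placeholder rate in USD per GB per month

    for years in (1, 3, 5):
        cost = cumulative_storage_cost(OUTPUT_SIZE_TB, PRICE_PER_GB_MONTH, 12 * years)
        print(f"{years} year(s): about ${cost:,.0f}")
```

Because the cost grows linearly with both data volume and retention time, it is worth deciding early which outputs need to stay in the cloud and which can be archived elsewhere or regenerated on demand.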

Overall, in a world of exploding data volume, the authors see some critical upside in a community shift to cloud resources:

“[T]he heavy lift of the peta-scale era for seismological data analysis is within reach. Embracing cloud and other scalable computing solutions transforms this burden into a catalyst. It allows researchers to interrogate Earth processes at resolutions and timescales that were previously out of reach, opening new frontiers of discovery. It also permits researchers to focus on fundamental physical methods, rather than being limited by a given observational period and spatial extent.”

You can access the paper here.