Online Laboratory for Data Compression in Climate Science and Meteorology

Date:

Tyree, J., Faghih-Naini, S., Dueben, P., Peters-von Gehlen, K., Järvinen, H. J. (20.11.2024) Online Laboratory for Data Compression in Climate Science and Meteorology. Finnish National Hydrological and Climate Modelling Seminar. Available from: doi:10.5281/zenodo.14191942.

Abstract

While the output volumes from high-resolution weather and climate models are increasing exponentially, data storage, access, and analysis methods have not kept up. Data compression is a vital tool for coping with this increase in data production. However, as lossless compression can no longer deliver the required compression ratios, lossy compression should be applied instead.

Information loss sounds scary. While mounting research shows that model and measurement data contain “false information” (e.g. noise or uncertainty from measurements or numerical inaccuracies) that can be removed for better compression without degrading the data quality, a convincing argument for lossy data compression can only be made by domain scientists themselves.
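To make the idea of removable "false information" concrete, the following minimal sketch compares a lossless compression ratio with the ratio obtained after discarding low-order mantissa bits. It is an illustration under stated assumptions, not the method used in the online laboratory: the temperature-like field is synthetic, mantissa bit rounding stands in for whichever lossy method one might choose, zlib stands in for the lossless codec, and the names `round_mantissa`, `synthetic_field`, and `compare_ratios` are hypothetical.

```python
import math
import random
import struct
import zlib


def to_float32(value: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def round_mantissa(value: float, keep_bits: int) -> float:
    """Round a float32 value to `keep_bits` of its 23 mantissa bits.

    Assumes a finite value well away from the float32 overflow range.
    """
    if not 0 <= keep_bits < 23:
        raise ValueError("keep_bits must be in [0, 23)")
    drop = 23 - keep_bits
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    half = 1 << (drop - 1)                    # half of the last kept bit
    mask = ~((1 << drop) - 1) & 0xFFFFFFFF    # clears the dropped bits
    bits = (bits + half) & mask               # round, then truncate
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def synthetic_field(n: int = 20000, seed: int = 0) -> list:
    """Smooth signal around 280 plus small noise (made-up test data)."""
    rng = random.Random(seed)
    return [
        to_float32(280.0 + 10.0 * math.sin(2.0 * math.pi * i / n)
                   + rng.gauss(0.0, 0.05))
        for i in range(n)
    ]


def compression_ratio(values: list) -> float:
    """Raw float32 size divided by zlib-compressed size."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(raw) / len(zlib.compress(raw, 9))


def compare_ratios(keep_bits: int = 12):
    """Return (lossless ratio, ratio after bit rounding, max abs error)."""
    original = synthetic_field()
    rounded = [round_mantissa(v, keep_bits) for v in original]
    max_error = max(abs(a - b) for a, b in zip(original, rounded))
    return compression_ratio(original), compression_ratio(rounded), max_error


if __name__ == "__main__":
    lossless, lossy, max_error = compare_ratios()
    print(f"lossless ratio:        {lossless:.2f}")
    print(f"bit-rounded ratio:     {lossy:.2f}")
    print(f"max absolute error:    {max_error:.5f}")
```

In this synthetic setup, keeping 12 mantissa bits bounds the rounding error below the injected noise level, so the discarded bits carry mostly noise. Whether a comparable trade-off is acceptable for real data is exactly the question a domain scientist would need to answer for their own analysis.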

As part of the EuroHPC ESiWACE, Phase 3, Centre of Excellence (https://www.esiwace.eu/), we have been developing an online compression laboratory that showcases different compression methods and allows scientists to try them out on their own data to observe how their scientific analyses would be affected. Crucially, the online laboratory is accessible as a normal website that hosts a JupyterLab-like environment and requires no user-side installations.

We aim to showcase the online laboratory to a broad audience of domain scientists and to gather their needs and concerns about lossy data compression, so that we can direct our future research and develop recommendations for safe scientific lossy data compression.