Simulations and next-generation observation platforms generate huge volumes of data, but not all of it needs to be stored as is. We are investigating techniques to identify data that does not need to be stored (and, conversely, data that is important), and to compress data when it does need to be stored.
We are directly investigating:
- The use of deep learning to identify important data features which need to be stored.
- The use of regridding to reduce data volumes.
- The use of fabric technology (e.g. network cards) to perform compression.
- The use of metadata standards to ensure scientists have faith in the provenance of compressed data.
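As an illustration of how regridding reduces data volumes, the sketch below block-averages a fine-grid field onto a coarser grid. This is a minimal example, not our production pipeline: the `coarsen` helper and the grid sizes are hypothetical, and real climate regridding would typically use conservative, area-weighted schemes on the actual model grid.

```python
import numpy as np

def coarsen(field, factor):
    """Regrid a 2-D field by block-averaging factor x factor cells.

    Illustrative only: assumes a regular grid whose dimensions
    divide evenly by `factor`.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must divide evenly by factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

# A hypothetical 0.25-degree global field regridded to 1 degree:
fine = np.random.default_rng(0).random((720, 1440))
coarse = coarsen(fine, 4)

print(coarse.shape)               # (180, 360)
print(fine.size // coarse.size)   # 16x fewer values to store
```

Because block averaging preserves the field mean, the coarse product remains useful for many diagnostics even though fine-scale detail is discarded.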
Professor of Weather and Climate Computing
CMS colleagues work on our projects as necessary.