Data Reduction

Simulations and next-generation observation platforms generate huge volumes of data, but not all of it needs to be stored as is. We are investigating techniques to identify data that does not need to be stored (and, conversely, data that is important), and to compress the data that does need to be stored.

We are directly investigating:

The use of deep learning to identify important data features that need to be stored.
The use of regridding to reduce data volumes.
The use of fabric technology (e.g. network cards) to perform compression.
The use of metadata standards to ensure scientists have faith in the provenance of compressed data.
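
As a rough illustration of the regridding idea above, the sketch below coarsens a 2D field by averaging non-overlapping blocks of cells, which cuts the number of stored values by the square of the coarsening factor. The function name `coarsen` and the simple block-averaging scheme are assumptions made for illustration, not necessarily the method used in our projects; real regridding would normally also account for cell areas and missing data.

```python
def coarsen(field, factor):
    """Average non-overlapping factor x factor blocks of a 2D list of numbers.

    Hypothetical helper for illustration only: unweighted block means,
    no handling of missing values or varying cell areas.
    """
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for i in range(0, rows, factor):
        coarse_row = []
        for j in range(0, cols, factor):
            # Collect every fine-grid value inside this coarse cell.
            block = [
                field[i + di][j + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# A 4 x 4 field with values 0..15, reduced to 2 x 2 (16 values become 4).
fine = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
coarse = coarsen(fine, 2)
print(coarse)
```

The reduction is lossy: detail finer than the coarse grid spacing cannot be recovered, which is why deciding what is important to keep matters before regridding.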

People

Bryan Lawrence

Professor of Weather and Climate Computing

Daniel Galea

Former Ph.D. Student

Computational Modelling Services (NCAS)

CMS colleagues work on our projects as necessary.