SSH variability

Description

The sshVariability diagnostic is part of the frontier diagnostics of the AQUA framework. It computes the standard deviation of sea surface height (SSH) for model output (e.g. FESOM, ICON, NEMO) and compares it against the AVISO reference dataset. The diagnostic works on both HEALPix and regular lat-lon grids, and it also provides plots of the SSH variability.

SSH variability provides insight into the dynamics of the ocean. It represents the changes in sea surface height over time, which are influenced by factors such as ocean currents, wind patterns, tides, and interactions with the atmosphere. Studying SSH variability therefore helps in understanding oceanic processes and their impact on climate. High-resolution climate models simulate fine-scale variations in SSH, capturing small-scale features and regional differences that are highly relevant for climate adaptation, for instance in the coastal management of hazards such as flooding or storm surges.

Classes

There are two main classes in this diagnostic, sshVariabilityCompute and sshVariabilityPlot, supported by the helper classes BaseMixin and PlotBaseMixin.

  • sshVariabilityCompute: class to compute the SSH variability. It retrieves the data on its original grid, optionally regridding it to a different resolution. The point-wise SSH standard deviation is then computed over the given time interval. If no time interval is provided, the standard deviation is computed over the full time range of the dataset. The result is stored in a NetCDF file using the AQUA OutputSaver class.

  • sshVariabilityPlot: class to plot the SSH variability. Once the standard deviation has been computed, it can be passed to this class for plotting. The class plots the standard deviation for the given model and for the reference AVISO data. It can also plot the difference between the AVISO and the model standard deviation, and it can plot a selected region together with the corresponding regional difference. The user can also choose the resolution at which the data are plotted.

  • BaseMixin: base class of sshVariabilityCompute. It retrieves the data using the Reader class of the AQUA core and provides the functionality to save the output as a NetCDF file.

  • PlotBaseMixin: base class of sshVariabilityPlot. It mainly provides the functionality to save the plots as PNG and PDF.
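
The point-wise standard deviation described above can be illustrated with a minimal sketch. It uses only the Python standard library and plain nested lists. The function name `pointwise_std` and the sample values are hypothetical, and this is not the AQUA implementation. On an xarray object the equivalent reduction would typically be written as `data.std(dim="time")`.

```python
from statistics import pstdev


def pointwise_std(fields):
    """Return the per-grid-point standard deviation over the time axis.

    `fields` is a list of time steps; each time step is a 2D list
    indexed as [lat][lon]. All time steps must share the same shape.
    """
    n_lat = len(fields[0])
    n_lon = len(fields[0][0])
    return [
        [pstdev(step[j][i] for step in fields) for i in range(n_lon)]
        for j in range(n_lat)
    ]


# Three time steps on a tiny 1 x 2 grid (hypothetical SSH values in metres)
ssh = [
    [[0.0, 1.0]],
    [[0.0, 2.0]],
    [[0.0, 3.0]],
]
std_map = pointwise_std(ssh)  # one standard deviation per grid point
```

The first grid point never changes, so its variability is zero, while the second one varies from step to step and gets a positive value. The sketch uses the population standard deviation; whether AQUA uses the population or the sample form is not stated here.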

File Structure

  • The diagnostic is located in src/aqua_diagnostics/sshVariability directory, which contains both the source code and the command line interface (CLI) script.

  • The configuration file for the CLI is located in config/diagnostics/sshVariability directory with default options.

  • A notebook with a usage example is available at notebooks/diagnostics/sshVariability/sshVariability.ipynb.

  • README.md: a readme file with technical information on how to install the SSH diagnostic and its environment, and on the version of the diagnostic.

Input variables and datasets

By default, the diagnostic compares against the AVISO dataset, but it can be configured to use any other dataset as a reference. The diagnostic uses the variable zos or avg_zos. The output (NetCDF, PNG and PDF) is stored using the OutputSaver class in both the BaseMixin and PlotBaseMixin classes.

The diagnostic is designed to work with both the data from the Low Resolution Archive (LRA) and the original high resolution Healpix data. The LRA is generated by the Data reduction OPerator (DROP) of the AQUA project, which provides monthly data at a 1x1 degree resolution.

Basic Usage

The basic usage of this diagnostic is explained with a working example in the notebook provided in the notebooks/diagnostics/sshVariability directory. The basic structure of the analysis is the following:

Example usage

from aqua.diagnostics import sshVariabilityCompute, sshVariabilityPlot

# You can name these dictionaries as you like
dataset_dict = {
    "catalog": "climatedt-phase1",
    "model": "IFS-NEMO",
    "exp": "historical-1990",
    "source": "ssh-IFS-NEMO-test",
    "regrid": "r025",
}

dataset_dict_ref = {
    "catalog": "obs",
    "model": "AVISO",
    "exp": "ssh-L4",
    "source": "ssh-AVISO-test",
    "regrid": "r025",
}

startdate = "1994-01-01"
enddate = "1994-01-04"

# Initialize the SSH compute class
ssh_dataset = sshVariabilityCompute(
    **dataset_dict,
    var="zos",
    startdate=startdate,
    enddate=enddate,
)

# Run the compute function and save as NetCDF
ssh_dataset.run()

# Initialize the SSH compute class for reference data (AVISO)
ssh_dataset_ref = sshVariabilityCompute(
    **dataset_dict_ref,
    var="zos",
    startdate=startdate,
    enddate=enddate,
)

# Run the compute function and save as NetCDF
ssh_dataset_ref.run()

# Initialize the SSH plot class
plot_class = sshVariabilityPlot()

# Plot SSH for model dataset
plot_dataset = {"catalog": "climatedt-phase1", "model": "IFS-NEMO", "exp": "historical-1990"}
plot_class.plot(
    dataset_std=ssh_dataset.data_std,
    **plot_dataset,
    startdate=startdate,
    enddate=enddate,
)

# Plot SSH for reference dataset
plot_dataset_ref = {"catalog": "obs", "model": "AVISO", "exp": "ssh-L4"}
plot_class.plot(
    dataset_std=ssh_dataset_ref.data_std,
    **plot_dataset_ref,
    startdate=startdate,
    enddate=enddate,
)

# Plot the difference between the model dataset and the AVISO reference for a sub-region
time_intervals = {
    "startdate": "1994-01-01",
    "enddate": "1994-01-04",
    "startdate_ref": "1994-01-01",
    "enddate_ref": "1994-01-04",
}

region_selection = {
    "region": "Agulhas",
    "lon_limits": [5, 50],
    "lat_limits": [-10, -50],
    "proj": "plate_carree",
    "proj_params": {},
    "tgt_grid_name": "r3600x1800"
}

_dataset_ref = {
    "catalog_ref": "obs",
    "model_ref": "AVISO",
    "exp_ref": "ssh-L4",
}

_dataset = {
    "catalog": "climatedt-phase1",
    "model": "IFS-NEMO",
    "exp": "historical-1990",
}

plot_class.plot_diff(
    dataset_std=ssh_dataset.data_std,
    dataset_std_ref=ssh_dataset_ref.data_std,
    **_dataset,
    **_dataset_ref,
    **region_selection,
    **time_intervals
)

Note

The user can also define the start and end date of the analysis and the reference dataset. If not specified otherwise, plots will be saved in PNG and PDF format in the current working directory.
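
In the example above, the reference arguments of plot_diff use the same keys as the model arguments with a `_ref` suffix. A small helper can derive them from a plain dictionary instead of typing them twice. The helper name `with_ref_suffix` is hypothetical and not part of AQUA.

```python
def with_ref_suffix(dataset):
    """Return a copy of `dataset` with '_ref' appended to every key."""
    return {f"{key}_ref": value for key, value in dataset.items()}


# Same content as plot_dataset_ref in the example above
plot_dataset_ref = {"catalog": "obs", "model": "AVISO", "exp": "ssh-L4"}
dataset_ref_kwargs = with_ref_suffix(plot_dataset_ref)
# {'catalog_ref': 'obs', 'model_ref': 'AVISO', 'exp_ref': 'ssh-L4'}
```

The resulting dictionary can be unpacked into plot_diff in place of the hand-written `_dataset_ref`.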

CLI usage

The diagnostic can be run from the command line interface (CLI) by running the following command:

cd $AQUA/src/aqua_diagnostics/sshVariability
python cli_sshVariability.py --config <path_to_config_file>

Additionally, the CLI can be run with the following optional arguments:

  • --config, -c: Path to the configuration file.

  • --nworkers, -n: Number of workers to use for parallel processing.

  • --cluster: Cluster to use for parallel processing. By default a local cluster is used.

  • --loglevel, -l: Logging level. Default is WARNING.

  • --catalog: Catalog to use for the analysis. Can be defined in the config file.

  • --model: Model to analyse. Can be defined in the config file.

  • --exp: Experiment to analyse. Can be defined in the config file.

  • --source: Source to analyse. Can be defined in the config file.

  • --outputdir: Output directory for the plots.
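
To make the option list concrete, the sketch below builds an argparse parser with the same flags. It is an illustration only and not the parser shipped with AQUA. The argument types and help texts are assumptions; only the flag names and the WARNING default come from the list above.

```python
import argparse


def build_parser():
    """Build a parser exposing the options listed above (illustrative only)."""
    parser = argparse.ArgumentParser(description="sshVariability CLI (sketch)")
    parser.add_argument("--config", "-c", type=str, help="Path to the configuration file")
    parser.add_argument("--nworkers", "-n", type=int, help="Number of workers for parallel processing")
    parser.add_argument("--cluster", type=str, help="Cluster to use; a local cluster if omitted")
    parser.add_argument("--loglevel", "-l", type=str, default="WARNING", help="Logging level")
    parser.add_argument("--catalog", type=str, help="Catalog to use for the analysis")
    parser.add_argument("--model", type=str, help="Model to analyse")
    parser.add_argument("--exp", type=str, help="Experiment to analyse")
    parser.add_argument("--source", type=str, help="Source to analyse")
    parser.add_argument("--outputdir", type=str, help="Output directory for the plots")
    return parser


# Parse a hypothetical command line instead of sys.argv
args = build_parser().parse_args(["-c", "config.yaml", "-n", "4", "--model", "IFS-NEMO"])
```

Options that are not given on the command line are left as None here, which makes it easy to fall back to the values from the config file.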

Config file structure

The configuration file is a YAML file that contains the details on the dataset to analyse or use as reference, the output directory and the diagnostic settings. Most of the settings are common to all the diagnostics (see Diagnostics configuration files). Here we describe only the specific settings for the sshVariability diagnostic.

  • sshVariability: a block (nested in the diagnostics block) containing options for the SSH Variability diagnostic. Variable-specific parameters override the defaults.

    • run: enable/disable the diagnostic.

    • diagnostic_name: name of the diagnostic. sshVariability by default.

    • variables: list of variables to analyse. In sshVariability this variable is zos or avg_zos.

    • startdate_data / enddate_data: time range for the dataset.

    • startdate_ref / enddate_ref: time range for the reference dataset.

diagnostics:
    sshVariability:
        run: true
        diagnostic_name: 'sshVariability'
        variables: 'zos'
        params:
            default:
                startdate_data: '1994-01-01'
                enddate_data: '1994-01-04'
                startdate_ref: '1994-01-01'
                enddate_ref: '1994-01-04'
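
The rule that variable-specific parameters override the defaults can be sketched as a simple dictionary merge. The function `resolve_params` and the `avg_zos` override block are hypothetical and only illustrate the idea; the way AQUA actually reads the params block may differ.

```python
def resolve_params(params, variable):
    """Merge the 'default' block with an optional variable-specific block."""
    resolved = dict(params.get("default", {}))
    # Keys present in the variable-specific block take precedence
    resolved.update(params.get(variable, {}))
    return resolved


params = {
    "default": {
        "startdate_data": "1994-01-01",
        "enddate_data": "1994-01-04",
    },
    # Hypothetical variable-specific block, not part of the shipped config
    "avg_zos": {"enddate_data": "1994-01-31"},
}

zos_params = resolve_params(params, "zos")          # defaults only
avg_zos_params = resolve_params(params, "avg_zos")  # enddate_data overridden
```
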

  • plot_params: defines the colorbar palette and limits and the projection parameters. Default values are used for anything not specified. Refer to src/aqua/util/projections.py for the available projections. Plots can be produced at the original resolution, or the data can be regridded to another resolution for a quick plot. By default the plotting grid is tgt_grid_name: 'r360x180' with the regridding method regrid_method: 'ycon'. More regridding options are documented in Regridding in AQUA (https://aqua.readthedocs.io/en/latest/regrid.html).

plot_params:
    default:
        projection: 'robinson'
        projection_params: {}
        vmin:
        vmax:
        cmap: 'RdBu_r'
        tgt_grid_name: 'r360x180'
        regrid_method: 'ycon'
    # sub region selection
    sub_region:
        name: Agulhas
        lon_limits: [5, 50]
        lat_limits: [-10, -50]
        projection: 'plate_carree'
        projection_params: {}
    # ONLY FOR ICON: Flags for northern and southern boundaries to mask out specific latitudes.
    # As AVISO does not have data under the sea ice, which ICON does,
    # to make the datasets comparable - SSH under sea ice for ICON can be masked out.
    mask_options:
        mask_northern_boundary: true
        mask_southern_boundary: true
        northern_boundary_latitude: 70
        southern_boundary_latitude: -62
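
Grid names such as 'r360x180' used for tgt_grid_name can be read as a number of longitude and latitude points. The sketch below assumes the CDO-style rNLONxNLAT convention for global regular grids. The helper `parse_regular_grid` is hypothetical, and short aliases such as 'r025' are deliberately not handled.

```python
import re


def parse_regular_grid(name):
    """Split a name such as 'r360x180' into (n_lon, n_lat)."""
    match = re.fullmatch(r"r(\d+)x(\d+)", name)
    if match is None:
        raise ValueError(f"Not an rNLONxNLAT grid name: {name!r}")
    return int(match.group(1)), int(match.group(2))


n_lon, n_lat = parse_regular_grid("r360x180")
lon_spacing = 360 / n_lon  # longitude spacing in degrees
```

Under this assumption the default plotting grid has a longitude spacing of one degree, while 'r3600x1800' from the usage example corresponds to a tenth of a degree.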

Output

The diagnostic produces four types of plots:

  • Global SSH variability plots for the given model and the reference.

  • Global difference plot (model vs reference)

  • Regional SSH variability plots for the given model and the reference.

  • Regional difference plot (model vs reference)

Plots are saved in both PDF and PNG format.

Observations

The default reference dataset is from AVISO Sea Surface Height Data, but custom references can be configured.

References

  • Copernicus Climate Change Service, Climate Data Store, (2018): Sea level gridded data from satellite observations for the global ocean from 1993 to present. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). DOI: 10.24381/cds.4c328c78 (Accessed on 01-Mar-2023)

Example Plot(s)

../_images/sshVariability.sshVariability.climatedt-phase1.IFS-NEMO.historical-1990.r1.png

SSH Variability for IFS-NEMO historical-1990.

../_images/sshVariability.sshVariability.obs.AVISO.ssh-L4.r1.png

SSH Variability for AVISO data.

../_images/sshVariability.sshVariability.climatedt-phase1.IFS-NEMO.historical-1990.r1.agulhas.png

SSH Variability for IFS-NEMO in Agulhas region.

../_images/sshVariability.sshVariability.obs.AVISO.ssh-L4.r1.agulhas.png

SSH Variability for AVISO in Agulhas region.

../_images/sshVariability.sshVariability_Difference.climatedt-phase1.IFS-NEMO.historical-1990.r1.obs.AVISO.ssh-L4.agulhas.png

SSH Variability difference between IFS-NEMO and AVISO in Agulhas region.

Available demo notebooks

Notebooks are stored in the notebooks/diagnostics/sshVariability directory and contain usage examples.

Authors and contributors

This diagnostic is authored and maintained by Maqsood Mubarak Rajput (@maqsoodrajput, maqsoodmubarak.rajput@awi.de). Contributions are welcome — please open an issue or a pull request. For questions or suggestions, contact the AQUA team or the maintainers.

Detailed API

This section provides a detailed reference for the Application Programming Interface (API) of the sshVariability diagnostic, produced from the diagnostic function docstrings.

ssh module

class aqua.diagnostics.sshVariability.SshVariabilityCompute(diagnostic_name: str = 'sshVariability', catalog: str = None, model: str = None, exp: str = None, source: str = None, startdate: str = None, enddate: str = None, freq: str = None, region: str = None, regrid: str = None, lon_limits: list[float] = None, lat_limits: list[float] = None, var: str = 'zos', long_name: str = None, short_name: str = None, units: str = None, save_netcdf: bool = True, rebuild: bool = True, outputdir: str = './', reader_kwargs: dict = {}, loglevel: str = 'WARNING')

Bases: BaseMixin

SSH Computation

Initialize the ‘SshVariabilityCompute’ class.

This class is designed to load an xarray.Dataset and compute its standard deviation (STD).

Parameters:
  • diagnostic_name (str) – Default is ‘sshVariability’.

  • catalog (str) – Name of the catalog. Mandatory if ‘save_netcdf=True’.

  • model (str) – Name of the model.

  • exp (str) – Name of the experiment.

  • source (str) – Name of the source.

  • startdate (str) – Start date. If no dates are given, the whole dataset is retrieved.

  • enddate (str) – End date. If no dates are given, the whole dataset is retrieved.

  • freq (str) – Frequency of the data. Not yet implemented (on the TODO list). It becomes important when implementing the ‘variance of the variances formula’.

  • region (str) – Name of the sub-region. Default is ‘None’. Mandatory for sub-region STD computation. If ‘lon_limits’ and ‘lat_limits’ are None, they are taken from the region file in AQUA.

  • regrid (str) – Regrid option for the data. NOTE: the regridding is applied before computing the STD.

  • lon_limits (list[float]) – List of longitude limits. Default is ‘None’.

  • lat_limits (list[float]) – List of latitude limits. Default is ‘None’.

  • var (str) – Variable name for ssh data. Default is ‘zos’.

  • long_name (str) – If not given extracted from the data.

  • short_name (str) – If not given extracted from the data.

  • units (str) – If not given extracted from the data.

  • save_netcdf (bool) – Default is ‘True’.

  • rebuild (bool) – Recomputes and saves the NetCDF file. Default is ‘True’.

  • outputdir (str) – Output directory. Default is ‘./’.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

Keyword Arguments:
  • zoom (int, optional) – HEALPix grid zoom level (e.g. zoom=10 is h1024). Allows for multiple gridname definitions.

  • realization (int, optional) – The ensemble realization number, included in the output filename.

  • **kwargs – Additional arbitrary keyword arguments to be passed as additional parameters to the intake catalog entry
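
The relation between the zoom level and the HEALPix grid name can be checked with a few lines of Python. It relies on the standard HEALPix definitions, nside = 2**zoom and 12 * nside**2 cells on the sphere. The function names are hypothetical and not part of AQUA.

```python
def healpix_nside(zoom):
    """Return the HEALPix nside for a given zoom level (nside = 2**zoom)."""
    return 2 ** zoom


def healpix_npix(zoom):
    """Return the number of HEALPix cells on the sphere (12 * nside**2)."""
    return 12 * healpix_nside(zoom) ** 2


nside = healpix_nside(10)  # 1024, matching the 'h1024' example above
npix = healpix_npix(10)    # 12582912 cells
```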

run()
Parameters:

create_catalog_entry (bool) – Option for creating catalog entry. Default is ‘False’.

This method performs three steps: a) retrieve the data and regrid it if requested, b) compute the STD, c) save the result as NetCDF.

class aqua.diagnostics.sshVariability.SshVariabilityPlot(diagnostic_name='sshVariability', outputdir='./', loglevel='WARNING')

Bases: PlotBaseMixin

Plot sshVariability and the difference of sshVariability

Initialize the sshVariability.

Parameters:
  • diagnostic_name (str) – sshVariability

  • outputdir (str) – output directory

  • loglevel (str) – Default WARNING

plot(var=None, dataset_std=None, catalog=None, model=None, exp=None, startdate=None, enddate=None, plot_options={}, figsize: tuple = (11, 8.5), ax_pos: tuple = (1, 1, 1), vmin=None, vmax=None, gridlines=True, proj='robinson', proj_params={}, save_format=['png', 'pdf', 'svg'], dpi=600, region=None, lon_limits=None, lat_limits=None, mask_options={}, mask_northern_boundary=True, mask_southern_boundary=True, northern_boundary_latitude=70, southern_boundary_latitude=-62, diagnostic_product='sshVariability', rebuild: bool = True, description=None, tgt_grid_name='r1440x721', regrid_method='ycon')

Visualize the SSH variability.

Plot the variability of sea surface height (SSH) from an input dataset.

This function visualizes SSH variability using configurable spatial, temporal, and plotting options. It supports contours, regional selection, custom projections, masking, and output saving in multiple formats.

Parameters:
  • var (str, optional) – Variable name for SSH, e.g., 'zos'.

  • dataset_std (xarray.Dataset, optional) – Dataset containing the SSH field to be plotted.

  • catalog (str, optional) – Catalog name. Used in plot titles. (Mandatory for labeling)

  • model (str, optional) – Model or dataset name. Used in plot titles. (Mandatory for labeling)

  • exp (str, optional) – Experiment identifier. Used in plot titles. (Mandatory for labeling)

  • startdate (str, optional) – Start date label to include in the plot title.

  • enddate (str, optional) – End date label to include in the plot title.

  • regrid (str or dict, optional) – Regridding option or parameters for spatial interpolation.

  • plot_options (dict, optional) – Additional keyword arguments for customizing the plot (e.g., colormap, linewidth).

  • vmin (float, optional) – Minimum value for color scaling. If None, determined automatically.

  • vmax (float, optional) – Maximum value for color scaling. If None, determined automatically.

  • proj (str, optional) – Map projection type. Default is 'robinson'.

  • proj_params (dict, optional) – Additional keyword arguments passed to the projection.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is ['png', 'pdf', 'svg'].

  • dpi (int, optional) – Resolution (dots per inch) for saved figures. Default is 600.

  • region (str, optional) – Region identifier. If provided, overrides lat/lon limits.

  • lon_limits (list[float], optional) – Longitude limits [min, max] for the plot.

  • lat_limits (list[float], optional) – Latitude limits [min, max] for the plot.

  • mask_options (dict, optional) – Options for masking grid cells (specific to ICON).

  • mask_northern_boundary (bool, optional) – If True, mask latitudes north of northern_boundary_latitude.

  • mask_southern_boundary (bool, optional) – If True, mask latitudes south of southern_boundary_latitude.

  • northern_boundary_latitude (float, optional) – Latitude above which data will be masked. Default is 70.

  • southern_boundary_latitude (float, optional) – Latitude below which data will be masked. Default is -62.

  • diagnostic_product (str, optional) – Diagnostic type, e.g., 'sshVariability'. Default is 'sshVariability'.

  • rebuild (bool, optional) – If True, rebuild the data from the original files. Default is True.

  • description (str, optional) – Additional description to include in the plot or metadata.

  • tgt_grid_name (str, optional) – Target grid name for regridding. Default is ‘r1440x721’.

  • regrid_method (str, optional) – Regridding method to use. Default is ‘ycon’.

Returns:

The generated plot figure object.

Return type:

matplotlib.figure.Figure

Raises:
  • ValueError – If required arguments (e.g., catalog, model, exp) are missing.

  • TypeError – If inputs are of invalid type (e.g., dataset not an xarray.Dataset).

plot_diff(var=None, dataset_std=None, catalog=None, model=None, exp=None, startdate=None, enddate=None, dataset_std_ref=None, catalog_ref=None, model_ref=None, exp_ref=None, startdate_ref=None, enddate_ref=None, figsize: tuple = (11, 8.5), ax_pos: tuple = (1, 1, 1), plot_options={}, vmin_diff=None, vmax_diff=None, gridlines=True, proj='robinson', proj_params={}, save_format=['png', 'pdf', 'svg'], dpi=600, region=None, lon_limits=None, lat_limits=None, mask_options={}, mask_northern_boundary=True, mask_southern_boundary=True, northern_boundary_latitude=70, southern_boundary_latitude=-62, diagnostic_product='sshVariability_Difference', description=None, rebuild: bool = True, tgt_grid_name='r1440x721', regrid_method='ycon')

Visualize the difference in sea surface height (SSH) variability between a model and a reference dataset.

This function generates a map of SSH variability differences using Cartopy projections, supporting custom contour, masking, regional selection, and configurable plotting options. The plot can be saved as PNG or PDF.

Parameters:
  • var (str, optional) – Variable name to plot (e.g., ‘zos’).

  • dataset_std (xarray.Dataset, optional) – Dataset of the model to be plotted.

  • catalog (str, optional) – Catalog name for the model dataset (used in plot title).

  • model (str, optional) – Model name of the dataset (used in plot title).

  • exp (str, optional) – Experiment name of the dataset (used in plot title).

  • startdate (str, optional) – Start date of the dataset for the plot title.

  • enddate (str, optional) – End date of the dataset for the plot title.

  • dataset_std_ref (xarray.Dataset, optional) – Reference dataset for comparison.

  • catalog_ref (str, optional) – Catalog name for the reference dataset.

  • model_ref (str, optional) – Model name of the reference dataset.

  • exp_ref (str, optional) – Experiment name of the reference dataset.

  • startdate_ref (str, optional) – Start date of the reference dataset.

  • enddate_ref (str, optional) – End date of the reference dataset.

  • regrid (str or dict, optional) – Regridding method or parameters.

  • plot_options (dict, optional) – Additional keyword arguments for plotting (e.g., colormap, alpha).

  • vmin_diff (float, optional) – Minimum value for color scaling. If None, determined automatically.

  • vmax_diff (float, optional) – Maximum value for color scaling. If None, determined automatically.

  • proj (str, optional) – Map projection. Default is ‘robinson’.

  • proj_params (dict, optional) – Additional keyword arguments for the projection.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is ['png', 'pdf', 'svg'].

  • dpi (int, optional) – Resolution of the saved figure. Default is 600.

  • region (str, optional) – Region identifier for the plot.

  • lon_limits (list[float], optional) – Longitude limits [min, max] for the plot.

  • lat_limits (list[float], optional) – Latitude limits [min, max] for the plot.

  • mask_options (dict, optional) – Options for masking (specific to ICON grids).

  • mask_northern_boundary (bool, optional) – Mask latitudes above northern_boundary_latitude. Default is True.

  • mask_southern_boundary (bool, optional) – Mask latitudes below southern_boundary_latitude. Default is True.

  • northern_boundary_latitude (float, optional) – Latitude above which data is masked. Default is 70.

  • southern_boundary_latitude (float, optional) – Latitude below which data is masked. Default is -62.

  • diagnostic_product (str, optional) – Diagnostic product identifier. Default is ‘sshVariability_Difference’.

  • description (str, optional) – Additional description for the plot metadata or title.

  • rebuild (bool, optional) – If True, rebuild the data from the original files. Default is True.

  • tgt_grid_name (str, optional) – Target grid name for regridding. Default is ‘r1440x721’.

  • regrid_method (str, optional) – Regridding method to use. Default is ‘ycon’.

Returns:

The generated figure object.

Return type:

matplotlib.figure.Figure

Raises:
  • ValueError – If required dataset or catalog/model/exp information is missing.

  • TypeError – If input datasets are not xarray.Datasets.

subregion_selection(data=None, model=None, exp=None, mask_northern_boundary=None, northern_boundary_latitude=None, mask_southern_boundary=None, southern_boundary_latitude=None, lon_lim=None, lat_lim=None, region_name=None)

Select a sub-region based on longitude and latitude limits.
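
The idea behind the sub-region selection and the optional boundary masking can be sketched with plain Python. This is a simplified illustration on individual points, not the AQUA implementation. The function names are hypothetical, sorting the limits is an assumption made here so that [-10, -50] and [-50, -10] behave the same, and boxes crossing the dateline are not handled.

```python
def in_subregion(lat, lon, lon_lim, lat_lim):
    """Return True if the point lies inside the lon/lat box."""
    lon_min, lon_max = sorted(lon_lim)
    lat_min, lat_max = sorted(lat_lim)
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


def is_masked(lat, northern_boundary_latitude=70, southern_boundary_latitude=-62,
              mask_northern_boundary=True, mask_southern_boundary=True):
    """Return True if the latitude falls in a polar band that should be masked."""
    if mask_northern_boundary and lat > northern_boundary_latitude:
        return True
    if mask_southern_boundary and lat < southern_boundary_latitude:
        return True
    return False


# Hypothetical (lat, lon) points; limits taken from the Agulhas example
points = [(-35.0, 20.0), (0.0, 20.0), (75.0, 20.0), (-65.0, 20.0)]
agulhas = [p for p in points if in_subregion(p[0], p[1], [5, 50], [-10, -50])]
unmasked = [p for p in points if not is_masked(p[0])]
```

Only the first point falls inside the Agulhas box, and the two polar points are removed by the default masking boundaries at 70 and -62 degrees.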