API Reference

This section provides a detailed reference for AQUA-diagnostics’ Application Programming Interface (API).

AQUA Diagnostics Package

class aqua.diagnostics.Boxplots(catalog: str = None, model: str = None, exp: str = None, source: str = None, var: str | list[str] = None, startdate: str = None, enddate: str = None, regrid: str = None, diagnostic: str = 'boxplots', save_netcdf: bool = False, outputdir: str = './', loglevel: str = 'WARNING')

Bases: Diagnostic

Class for computing and plotting boxplots of field means from climate model datasets. This class retrieves data from specified datasets, computes field means for given variables, and optionally saves the results to NetCDF files.

Parameters:
  • catalog (str) – Catalog name.

  • model (str) – Model name.

  • exp (str) – Experiment name.

  • source (str) – Data source.

  • var (str or list of str, optional) – Variable(s) to retrieve. Defaults to None.

  • startdate (str, optional) – Start date for data retrieval. Defaults to None.

  • enddate (str, optional) – End date for data retrieval. Defaults to None.

  • regrid (str) – Target grid for regridding. If None, no regridding.

  • diagnostic (str) – Name of the diagnostic.

  • save_netcdf (bool, optional) – Whether to save results as NetCDF files. Defaults to False.

  • outputdir (str, optional) – Directory to save output files. Defaults to ‘./’.

  • loglevel (str, optional) – Logging level. Defaults to ‘WARNING’.

Initialize the diagnostic class. This is a general purpose class that can be used by the diagnostic classes to retrieve data from a single model and to save the data to a netcdf file. It is not a working diagnostic class by itself.

Parameters:
  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • regrid (str | None) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str | None) – The start date of the plot/analysis period. If None, all available data will be used.

  • enddate (str | None) – The end date of the plot/analysis period. If None, all available data will be used.

  • std_startdate (str | None) – The start date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • std_enddate (str | None) – The end date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 2
run(var: str = None, save_netcdf: bool = False, units: str = None, reader_kwargs: dict = {}) None

Retrieve and preprocess dataset, selecting pressure level and/or converting units if needed.

Parameters:
  • var (str or list of str, optional) – Variable or list of variables to retrieve. If None, uses self.var.

  • save_netcdf (bool, optional) – If True, saves output fldmeans as netcdf file. Defaults to False.

  • units (str or list of str, optional) – Target units (e.g., ‘mm/day’).

  • reader_kwargs (dict, optional) – Additional keyword arguments for the Reader.

Raises:
  • NoDataError – If variable not found in dataset.

  • KeyError – If the variable is missing from the data.
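
A minimal usage sketch for the constructor and run() documented above, assuming AQUA-diagnostics is installed and the requested data exists. The catalog, model, experiment and source names and the date string format are placeholders, not known catalog entries. The boxplots_cls argument is a hypothetical hook, not part of the package, that lets the sketch be exercised without AQUA.

```python
def run_boxplots(var, boxplots_cls=None, **init_kwargs):
    """Instantiate Boxplots and call run() with documented arguments."""
    if boxplots_cls is None:
        # Imported lazily so the sketch can be loaded without AQUA installed.
        from aqua.diagnostics import Boxplots as boxplots_cls
    diag = boxplots_cls(**init_kwargs)
    # save_netcdf=True asks run() to write the field means to NetCDF.
    diag.run(var=var, save_netcdf=True)
    return diag


# Placeholder identifiers (assumptions): replace with your own catalog entries.
EXAMPLE_KWARGS = dict(
    catalog="my-catalog",
    model="MY-MODEL",
    exp="my-exp",
    source="my-source",
    startdate="1990-01-01",  # date string format is an assumption
    enddate="1999-12-31",
    outputdir="./",
    loglevel="WARNING",
)

# Uncomment once the placeholders point at real data:
# run_boxplots(["2t"], **EXAMPLE_KWARGS)
```

Passing a list to var follows the class signature (str | list[str]); a single variable name should work the same way.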

class aqua.diagnostics.ENSO(catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, configdir: str = None, definition: str = 'teleconnections-destine', loglevel: str = 'WARNING')

Bases: BaseMixin

Class for calculating the El Niño Southern Oscillation (ENSO) index. This class is used to calculate the ENSO index from a given dataset. It inherits from the BaseMixin class and implements the necessary methods to calculate the ENSO index.

Initialize the ENSO class.

Parameters:
  • catalog (str) – Catalog name.

  • model (str) – Model name.

  • exp (str) – Experiment name.

  • source (str) – Source name.

  • regrid (str) – Regrid target. Default is None.

  • startdate (str) – Start date for data retrieval. Default is None.

  • enddate (str) – End date for data retrieval. Default is None.

  • configdir (str) – Configuration directory. Default is None.

  • definition (str) – Definition filename. Default is ‘teleconnections-destine’. This is used to deduce the variable name and the lat/lon for the index.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 24
compute_index(months_window: int = 3, box_brd: bool = True, rebuild: bool = False)

Evaluate the station-based index for a teleconnection. Field data must be monthly gridded data.

Parameters:
  • months_window (int, opt) – Number of months for the rolling average. Default is 3.

  • box_brd (bool, opt) – Whether the box boundary coordinates are included in the selection. Default is True.

  • rebuild (bool, opt) – If True, the index is recalculated. Default is False.
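
To illustrate what months_window controls, here is a small pure-Python sketch of a rolling average over a monthly series. It assumes a centered window and marks positions without a full window as None; whether compute_index aligns the window this way is not stated in this reference, so treat the alignment as an assumption.

```python
def rolling_mean(values, months_window=3):
    """Centered rolling mean; positions lacking a full window give None."""
    if months_window < 1:
        raise ValueError("months_window must be >= 1")
    out = []
    for i in range(len(values)):
        start = i - months_window // 2
        end = start + months_window
        if start < 0 or end > len(values):
            out.append(None)  # not enough neighbours for a full window
        else:
            out.append(sum(values[start:end]) / months_window)
    return out


monthly_index = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = rolling_mean(monthly_index, months_window=3)
```

A larger window smooths the index more strongly but loses more values at both ends of the series.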

retrieve(reader_kwargs: dict = {}) None

Retrieve the data for the ENSO index.

Parameters:

reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

class aqua.diagnostics.EnsembleLatLon(var=None, dataset=None, catalog_list=None, model_list=None, exp_list=None, source_list=None, ensemble_dimension_name='ensemble', description=None, outputdir='./', loglevel='WARNING')

Bases: BaseMixin

A class to compute ensemble mean and standard deviation of a 2D (lon-lat) Dataset. Make sure that the dataset has correct lon-lat dimensions.

Parameters:
  • var (str) – Variable name.

  • dataset – xarray Dataset composed of ensemble members of 2D lon-lat data, i.e., the individual Datasets (lon-lat) are concatenated along a new dimension “ensemble”. This dimension name can be changed.

  • catalog_list (str) – This variable defines the catalog list. The default is ‘None’. If None, the variable is assigned to ‘None_catalog’. In case of Multi-catalogs, the variable is assigned to ‘multi-catalog’.

  • model_list (str) – This variable defines the model list. The default is ‘None’. If None, the variable is assigned to ‘None_model’. In case of Multi-Model, the variable is assigned to ‘multi-model’.

  • exp_list (str) – This variable defines the exp list. The default is ‘None’. If None, the variable is assigned to ‘None_exp’. In case of Multi-Exp, the variable is assigned to ‘multi-exp’.

  • source_list (str) – This variable defines the source list. The default is ‘None’. If None, the variable is assigned to ‘None_source’. In case of Multi-Source, the variable is assigned to ‘multi-source’.

  • ensemble_dimension_name (str) – Name of the dimension along which the individual Datasets were concatenated. Default is “ensemble”.

  • outputdir (str) – String input for output path.

  • description (str) – Description of the netcdf.

  • loglevel (str) – Log level. Default is “WARNING”.

run()

A function to compute the mean and standard deviation of the input dataset. It is important to make sure that the dimension along which the mean is computed is correct. The default is dim=“ensemble”.
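
The reduction described above can be sketched in pure Python on nested lists, with one lat-lon grid per ensemble member. This is only an illustration of a point-wise mean and standard deviation along the ensemble dimension. It assumes the population standard deviation (ddof=0, which is also xarray's default); the class may use a different convention.

```python
from statistics import fmean, pstdev


def ensemble_mean_std(members):
    """members: list of 2D grids (lat x lon), one grid per ensemble member."""
    nlat, nlon = len(members[0]), len(members[0][0])
    mean = [[fmean([m[j][i] for m in members]) for i in range(nlon)]
            for j in range(nlat)]
    # Population standard deviation (ddof=0): an assumption, see lead-in.
    std = [[pstdev([m[j][i] for m in members]) for i in range(nlon)]
           for j in range(nlat)]
    return mean, std


# Two members on a 1 x 2 grid.
members = [[[1.0, 2.0]], [[3.0, 6.0]]]
ens_mean, ens_std = ensemble_mean_std(members)
```

With an xarray Dataset ds whose members are stacked along “ensemble”, the equivalent calls are ds.mean(dim="ensemble") and ds.std(dim="ensemble").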

class aqua.diagnostics.EnsembleTimeseries(var=None, hourly_data=None, daily_data=None, monthly_data=None, annual_data=None, catalog_list=None, model_list=None, exp_list=None, source_list=None, ensemble_dimension_name='ensemble', description=None, outputdir='./', loglevel='WARNING')

Bases: BaseMixin

This class computes mean and standard deviation of the timeseries ensemble.

NOTE: The STD is computed Point-wise along the mean.

Parameters:
  • var (str) – Variable name.

  • hourly_data – xarray Dataset of ensemble members of hourly timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • daily_data – xarray Dataset of ensemble members of daily timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • monthly_data – xarray Dataset of ensemble members of monthly timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • annual_data – xarray Dataset of ensemble members of annual timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • ensemble_dimension_name (str) – Name of the dimension along which the individual Datasets were concatenated. Default is “ensemble”.

  • catalog_list (list) – list of catalog names.

  • model_list (list) – list of model names. This is mandatory.

  • exp_list (list) – list of experiment names.

  • source_list (list) – list of source names.

  • description (str) – Description of the netcdf.

  • outputdir (str) – String input for output path.

  • loglevel (str) – Log level. Default is “WARNING”.

run()

A function to compute the mean and standard deviation of the input dataset. It is important to make sure that the dimension along which the mean is computed is correct. The default is dim=“ensemble”. TODO: Test DASK’s .compute() function here.

class aqua.diagnostics.EnsembleZonal(var=None, dataset=None, catalog_list=None, model_list=None, exp_list=None, source_list=None, ensemble_dimension_name='ensemble', outputdir='./', loglevel='WARNING')

Bases: BaseMixin

A class to compute ensemble mean and standard deviation of the zonal averages. Make sure that the dataset has correct lev-lat dimensions.

Parameters:
  • var (str) – Variable name.

  • dataset – xarray Dataset composed of ensemble members of 2D zonal data, i.e., the individual Datasets (lev-lat) are concatenated along a new dimension “ensemble”. This dimension name can be changed.

  • catalog_list (str) – This variable defines the catalog list. The default is ‘None’. If None, the variable is assigned to ‘None_catalog’. In case of Multi-catalogs, the variable is assigned to ‘multi-catalog’.

  • model_list (str) – This variable defines the model list. The default is ‘None’. If None, the variable is assigned to ‘None_model’. In case of Multi-Model, the variable is assigned to ‘multi-model’.

  • exp_list (str) – This variable defines the exp list. The default is ‘None’. If None, the variable is assigned to ‘None_exp’. In case of Multi-Exp, the variable is assigned to ‘multi-exp’.

  • source_list (str) – This variable defines the source list. The default is ‘None’. If None, the variable is assigned to ‘None_source’. In case of Multi-Source, the variable is assigned to ‘multi-source’.

  • ensemble_dimension_name (str) – Name of the dimension along which the individual Datasets were concatenated. Default is “ensemble”.

  • outputdir (str) – String input for output path.

  • loglevel (str) – Log level. Default is “WARNING”.

run()

A function to compute the mean and standard deviation of the input dataset. It is important to make sure that the dimension along which the mean is computed is correct. The default is dim=“ensemble”.

class aqua.diagnostics.GlobalBiases(catalog=None, model=None, exp=None, source=None, regrid=None, startdate=None, enddate=None, var=None, plev=None, areas=True, diagnostic='globalbiases', save_netcdf=True, outputdir='./', loglevel='WARNING')

Bases: Diagnostic

Diagnostic class for computing global and seasonal climatologies of a given variable.

This class handles data retrieval, pressure level selection, unit conversion, and computation of mean climatologies (total or seasonal).

Inherits from Diagnostic.

Parameters:
  • catalog (str) – The catalog to be used. If None, inferred from Reader.

  • model (str) – Model to be used.

  • exp (str) – Experiment name.

  • source (str) – Source name.

  • regrid (str) – Target grid for regridding. If None, no regridding.

  • startdate (str) – Start date for data selection.

  • enddate (str) – End date for data selection.

  • var (str) – Variable name to analyze.

  • plev (float) – Pressure level to select (if applicable).

  • areas (bool) – if True, save area weights for statistics computation.

  • diagnostic (str) – Name of the diagnostic.

  • save_netcdf (bool) – If True, saves output climatologies.

  • outputdir (str) – Output directory for NetCDF files.

  • loglevel (str) – Log level. Default is ‘WARNING’.

Initialize the diagnostic class. This is a general purpose class that can be used by the diagnostic classes to retrieve data from a single model and to save the data to a netcdf file. It is not a working diagnostic class by itself.

Parameters:
  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • regrid (str | None) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str | None) – The start date of the plot/analysis period. If None, all available data will be used.

  • enddate (str | None) – The end date of the plot/analysis period. If None, all available data will be used.

  • std_startdate (str | None) – The start date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • std_enddate (str | None) – The end date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 12
compute_climatology(data: Dataset = None, var: str = None, plev: float = None, save_netcdf: bool = None, seasonal: bool = False, seasons_stat: str = 'mean', areas=False, create_catalog_entry: bool = False) None

Compute total and optionally seasonal climatology for a variable.

Parameters:
  • data (xarray.Dataset, optional) – Input dataset. If None, uses self.data.

  • var (str, optional) – Variable name. If None, uses self.var.

  • plev (float, optional) – Pressure level (currently unused).

  • save_netcdf (bool, optional) – If True, save output to NetCDF.

  • seasonal (bool) – If True, compute seasonal climatology (DJF, MAM, JJA, SON).

  • seasons_stat (str) – Aggregation statistic: ‘mean’, ‘std’, ‘max’, ‘min’.

  • areas (bool) – If True, include cell area in the output dataset.

  • create_catalog_entry (bool) – If True, create a catalog entry for the data. Default is False.

Raises:

ValueError – If seasons_stat is invalid.
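
As an illustration of the seasonal and seasons_stat options, the sketch below groups monthly values into DJF, MAM, JJA and SON and aggregates each group with the chosen statistic, raising ValueError for an unsupported name. It is a pure-Python stand-in, not the library's implementation. It assumes unweighted months and a population standard deviation for ‘std’.

```python
from statistics import fmean, pstdev

# Month number -> season label.
SEASON_OF_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}
# The four statistic names listed for seasons_stat.
STATS = {"mean": fmean, "std": pstdev, "max": max, "min": min}


def seasonal_climatology(monthly, seasons_stat="mean"):
    """monthly: iterable of (month_number, value) pairs."""
    if seasons_stat not in STATS:
        raise ValueError(f"Invalid seasons_stat: {seasons_stat!r}")
    groups = {"DJF": [], "MAM": [], "JJA": [], "SON": []}
    for month, value in monthly:
        groups[SEASON_OF_MONTH[month]].append(value)
    # Seasons without any data are left out of the result.
    return {s: STATS[seasons_stat](v) for s, v in groups.items() if v}


sample = [(12, 1.0), (1, 3.0), (7, 10.0)]
clim_mean = seasonal_climatology(sample, "mean")
clim_max = seasonal_climatology(sample, "max")
```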

retrieve(var: str = None, formula: bool = False, long_name: str = None, short_name: str = None, plev: float = None, units: str = None, reader_kwargs: dict = {}) None

Retrieve and preprocess dataset, selecting pressure level and/or converting units if needed.

Parameters:
  • var (str, optional) – Variable to retrieve. If None, uses self.var.

  • formula (bool) – If True, the variable is a formula.

  • long_name (str) – The long name of the variable, if different from the variable name.

  • short_name (str) – The short name of the variable, if different from the variable name.

  • plev (float, optional) – Pressure level to extract.

  • units (str) – The units of the variable, if different from the original units.

  • reader_kwargs (dict, optional) – Additional keyword arguments for the Reader.

Raises:
  • NoDataError – If variable not found in dataset.

  • KeyError – If the variable is missing from the data.

savenetcdf(data: Dataset, diagnostic_product: str, rebuild: bool = True, create_catalog_entry: bool = False, extra_keys=None, dict_catalog_entry: dict = {'jinjalist': ['realization'], 'wildcardlist': ['var']})

Save data to NetCDF with proper metadata.

Parameters:
  • data (xr.Dataset) – Input dataset.

  • diagnostic_product (str) – The product name to be used in the filename (e.g., ‘annual_climatology’).

  • rebuild (bool) – If True, rebuild the data from the original files.

  • create_catalog_entry (bool) – If True, create a catalog entry for the data. Default is False.

  • extra_keys (dict) – Extra keys for filename generation.

  • dict_catalog_entry (dict) – A dictionary with catalog entry information. Default is {‘jinjalist’: [‘realization’], ‘wildcardlist’: [‘var’]}.

class aqua.diagnostics.GlobalMean(exp, year1, year2, config='config.yml', loglevel='WARNING', numproc=1, interface=None, model=None, ensemble='r1i1p1f1', addnan=False, silent=None, trend=None, line=None, outputdir=None, xdataset=None, reference='EC23', title=None)

Bases: object

exp

Experiment name.

Type:

str

year1

Start year of the experiment.

Type:

int

year2

End year of the experiment.

Type:

int

config

Path to the configuration file. Default is ‘config.yml’.

Type:

str

loglevel

Logging level. Default is ‘WARNING’.

Type:

str

numproc

Number of processes to use. Default is 1.

Type:

int

interface

Path to the interface file. Default is None.

Type:

str

model

Model name. Default is None.

Type:

str

ensemble

Ensemble identifier. Default is ‘r1i1p1f1’.

Type:

str

addnan

Whether to add NaNs. Default is False.

Type:

bool

silent

Whether to suppress output. Default is None.

Type:

bool

trend

Whether to compute trends. Default is None.

Type:

bool

line

Line identifier. Default is None.

Type:

str

outputdir

Output directory. Default is None.

Type:

str

xdataset

Path to the xdataset. Default is None.

Type:

str

loggy

Logger instance.

Type:

logging.Logger

diag

Diagnostic instance.

Type:

Diagnostic

face

Interface dictionary.

Type:

dict

ref

Reference dictionary.

Type:

dict

util_dictionary

Supporter instance.

Type:

Supporter

varmean

Dictionary to store variable means.

Type:

dict

vartrend

Dictionary to store variable trends.

Type:

dict

funcname

Name of the class.

Type:

str

start_time

Start time for the timer.

Type:

float

title

Title of the plot, overrides default title.

Type:

str

toc(message)

Update the timer and log the elapsed time.

prepare()

Prepare the necessary components for the global mean computation.

run()

Run the global mean computation using multiprocessing.

store()

Store the computed global mean values in a table and YAML file.

plot(mapfile=None, figformat='pdf')

Generate the heatmap for global mean.

gm_worker(util, ref, face, diag, varmean, vartrend, varlist)

Workhorse for the global mean computation.

final_toc()

Log the total elapsed time since the start.

static gm_worker(util, ref, face, diag, varmean, vartrend, varlist, loglevel)

Workhorse for the global mean computation.

Parameters:
  • util (Supporter) – Utility dictionary for remapping and masks.

  • ref (dict) – Reference climatology dictionary.

  • face (dict) – Interface dictionary.

  • diag (Diagnostic) – Diagnostic instance.

  • varmean (dict) – Shared dictionary to store variable means.

  • vartrend (dict) – Shared dictionary to store variable trends.

  • varlist (list) – List of variables to process.

plot(diagname='global_mean', mapfile=None, figformat='pdf', storefig=True, returnfig=False, addnan=True)

Generate the heatmap for global mean.

Parameters:
  • diagname (str) – Name of the diagnostic. Default is ‘global_mean’.

  • mapfile (str) – Path to the output file. If None, it will be defined automatically following ECmean syntax.

  • figformat (str) – Format of the output file. Default is ‘pdf’.

  • storefig (bool) – If True, store the figure in the specified file. Default is True.

  • returnfig (bool) – If True, return the figure object. Default is False.

  • addnan (bool) – If True, add NaN values to the plot. Default is True.

prepare()

Prepare the necessary components for the global mean computation.

run()

Run the global mean computation across all variables using multiprocessing.

store(yamlfile=None, tablefile=None)

Rearrange the data and save the YAML file and the table.

Parameters:
  • yamlfile – Path to the output YAML file. If None, it will be defined automatically.

  • tablefile – Path to the output TXT file. If None, it will be defined automatically.

toc(message)

Update the timer and log the elapsed time.
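
A hedged sketch of how the GlobalMean methods might be chained, assuming the order prepare, run, store, plot in which the class docstring lists them. The experiment name and years in the commented call are placeholders. The global_mean_cls argument is a hypothetical hook, not part of the package, that lets the sketch be exercised without AQUA installed.

```python
def global_mean_workflow(exp, year1, year2, global_mean_cls=None, **kwargs):
    """Run the GlobalMean steps in the order listed in the class docstring."""
    if global_mean_cls is None:
        # Imported lazily so the sketch can be loaded without AQUA installed.
        from aqua.diagnostics import GlobalMean as global_mean_cls
    gm = global_mean_cls(exp, year1, year2, **kwargs)
    gm.prepare()  # set up the components needed for the computation
    gm.run()      # compute global means using multiprocessing
    gm.store()    # write the table and the YAML file
    gm.plot()     # produce the heatmap
    return gm


# Placeholder experiment and years (assumptions):
# global_mean_workflow("my-exp", 1990, 1999, config="config.yml", numproc=2)
```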

class aqua.diagnostics.Gregory(diagnostic_name: str = 'gregory', catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, loglevel: str = 'WARNING')

Bases: Diagnostic

Initialize the Gregory Plot class. This class evaluates the values necessary for the Gregory Plot from a single model and saves the data to a netcdf file.

Parameters:
  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • regrid (str) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str) – The start date of the data to be retrieved. If None, all available data will be retrieved.

  • enddate (str) – The end date of the data to be retrieved. If None, all available data will be retrieved.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

Initialize the diagnostic class. This is a general purpose class that can be used by the diagnostic classes to retrieve data from a single model and to save the data to a netcdf file. It is not a working diagnostic class by itself.

Parameters:
  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • regrid (str | None) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str | None) – The start date of the plot/analysis period. If None, all available data will be used.

  • enddate (str | None) – The end date of the plot/analysis period. If None, all available data will be used.

  • std_startdate (str | None) – The start date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • std_enddate (str | None) – The end date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 2
compute_net_toa(freq: list = ['monthly', 'annual'], std: bool = False, exclude_incomplete=True)

Compute the net TOA radiation data.

Parameters:
  • freq (list) – The frequency of the data to be computed. Default is [‘monthly’, ‘annual’].

  • std (bool) – Whether to compute the standard deviation. Default is False.

  • exclude_incomplete (bool) – Whether to exclude incomplete timespans. Default is True.

compute_t2m(freq: list = ['monthly', 'annual'], std: bool = False, var: str = '2t', units: str = 'degC', exclude_incomplete=True)

Compute the 2m temperature data.

Parameters:
  • freq (list) – The frequency of the data to be computed. Default is [‘monthly’, ‘annual’].

  • std (bool) – Whether to compute the standard deviation. Default is False.

  • var (str) – The name of the 2m temperature variable. Default is ‘2t’.

  • units (str) – The units of the data. Default is ‘degC’.

  • exclude_incomplete (bool) – Whether to exclude incomplete timespans. Default is True.
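
The exclude_incomplete flag can be pictured with a small pure-Python sketch that builds annual means from monthly values and drops years with missing months. It assumes that “incomplete” means fewer than 12 monthly values in a calendar year; the library's exact criterion is not given in this reference.

```python
from collections import defaultdict
from statistics import fmean


def annual_means(monthly, exclude_incomplete=True):
    """monthly: iterable of (year, month, value) tuples."""
    by_year = defaultdict(dict)
    for year, month, value in monthly:
        by_year[year][month] = value
    result = {}
    for year in sorted(by_year):
        months = by_year[year]
        # Assumption: a year is complete only if all 12 months are present.
        if exclude_incomplete and len(months) < 12:
            continue
        result[year] = fmean(list(months.values()))
    return result


# Year 2000 is complete (values 1..12), year 2001 has only three months.
data = [(2000, m, float(m)) for m in range(1, 13)]
data += [(2001, m, 10.0) for m in range(1, 4)]
complete_only = annual_means(data)
all_years = annual_means(data, exclude_incomplete=False)
```

Keeping incomplete years biases the annual value toward the months that happen to be available, which is why the flag defaults to True.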

retrieve(t2m: bool = True, net_toa: bool = True, t2m_name: str = '2t', net_toa_name: str = 'tnlwrf+tnswrf', reader_kwargs: dict = {})

Retrieve the necessary data for the Gregory Plot.

Parameters:
  • t2m (bool) – Whether to retrieve the 2m temperature data. Default is True.

  • net_toa (bool) – Whether to retrieve the net TOA radiation data. Default is True.

  • t2m_name (str) – The name of the 2m temperature data.

  • net_toa_name (str) – The name of the net TOA radiation data.

  • reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

run(freq: list = ['monthly', 'annual'], t2m: bool = True, net_toa: bool = True, std: bool = False, t2m_name: str = '2t', net_toa_name: str = 'tnlwrf+tnswrf', t2m_units: str = 'degC', exclude_incomplete: bool = True, outputdir: str = './', rebuild: bool = True, reader_kwargs: dict = {})

Run the Gregory Plot.

Parameters:
  • freq (list) – The frequency of the data to be computed. Default is [‘monthly’, ‘annual’].

  • t2m (bool) – Whether to compute the 2m temperature data. Default is True.

  • net_toa (bool) – Whether to compute the net TOA radiation data. Default is True.

  • std (bool) – Whether to compute the standard deviation. Default is False.

  • t2m_name (str) – The name of the 2m temperature variable. Default is ‘2t’.

  • net_toa_name (str) – The name of the net TOA radiation formula. Default is ‘tnlwrf+tnswrf’.

  • t2m_units (str) – The units of the 2m temperature data. Default is ‘degC’.

  • exclude_incomplete (bool) – Whether to exclude incomplete timespans. Default is True.

  • outputdir (str) – The output directory to save the netcdf file. Default is ‘./’.

  • rebuild (bool) – Whether to rebuild the netcdf file. Default is True.

  • reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

save_netcdf(freq: list = ['monthly', 'annual'], std: bool = False, t2m: bool = True, net_toa: bool = True, outputdir: str = './', rebuild: bool = True)

Save the computed data to a netcdf file.

Parameters:
  • freq (list) – The frequency of the data to be saved. Default is [‘monthly’, ‘annual’].

  • std (bool) – Whether to save the standard deviation. Default is False.

  • t2m (bool) – Whether to save the 2m temperature data. Default is True.

  • net_toa (bool) – Whether to save the net TOA radiation data. Default is True.

  • outputdir (str) – The output directory to save the netcdf file. Default is ‘./’.

  • rebuild (bool) – Whether to rebuild the netcdf file. Default is True.

class aqua.diagnostics.Histogram(model: str, exp: str, source: str, catalog: str = None, regrid: str = None, startdate: str = None, enddate: str = None, region: str = None, lon_limits: list = None, lat_limits: list = None, regions_file_path: str = None, bins: int = 100, range: tuple = None, weighted: bool = True, diagnostic_name: str = 'histogram', loglevel: str = 'WARNING')

Bases: Diagnostic

Class to compute histograms and probability density functions (PDFs) of a variable over a specified region. Retrieves data from catalog, computes histograms/PDFs for the entire period, and saves results to netcdf files.

Initialize the Histogram diagnostic class.

Parameters:
  • model (str) – Model to be used for data retrieval.

  • exp (str) – Experiment to be used for data retrieval.

  • source (str) – Source to be used for data retrieval.

  • catalog (str, optional) – Catalog for data retrieval.

  • regrid (str, optional) – Regridding method.

  • startdate (str, optional) – Start date of data to retrieve.

  • enddate (str, optional) – End date of data to retrieve.

  • region (str, optional) – Region for data retrieval.

  • lon_limits (list, optional) – Longitude limits of region.

  • lat_limits (list, optional) – Latitude limits of region.

  • regions_file_path (str, optional) – Path to regions file.

  • bins (int, optional) – Number of bins for histogram. Default 100.

  • range (tuple, optional) – Range for histogram bins (min, max).

  • weighted (bool, optional) – Use latitudinal weights. Default True.

  • diagnostic_name (str, optional) – Name of diagnostic. Default ‘histogram’.

  • loglevel (str, optional) – Log level.

MINIMUM_MONTHS_REQUIRED = 12
compute_histogram(box_brd: bool = True, density: bool = True)

Compute histogram of the data for the entire period.

Parameters:
  • box_brd (bool) – Include box boundaries in area selection.

  • density (bool) – If True, returns PDF normalized to integrate to 1.
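
The sketch below shows, in pure Python, what bins, range, weighted and density mean for a histogram. It assumes that the latitudinal weights are cos(latitude); the weighting actually used by the class is not spelled out here. The parameter is named value_range in the sketch only to avoid shadowing Python's built-in range.

```python
import math


def weighted_histogram(values, lats, bins=100, value_range=None,
                       weighted=True, density=True):
    """Return (histogram, bin_edges) for values observed at latitudes lats."""
    lo, hi = value_range if value_range is not None else (min(values), max(values))
    if hi <= lo:
        raise ValueError("value_range must satisfy min < max")
    width = (hi - lo) / bins
    counts = [0.0] * bins
    for v, lat in zip(values, lats):
        if not lo <= v <= hi:
            continue  # values outside the range are ignored
        idx = min(int((v - lo) / width), bins - 1)  # right edge -> last bin
        # Assumed weighting: cosine of latitude (grid-cell area proxy).
        counts[idx] += math.cos(math.radians(lat)) if weighted else 1.0
    if density:
        total = sum(counts)
        if total == 0:
            raise ValueError("no data inside value_range")
        counts = [c / (total * width) for c in counts]  # integrates to 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


pdf, edges = weighted_histogram([0.0, 1.0, 2.0, 3.0], [0, 0, 0, 0],
                                bins=2, value_range=(0.0, 4.0))
```

With density=True the values are a PDF, so summing each value times its bin width gives 1; with density=False they are (weighted) counts.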

retrieve(var: str, formula: bool = False, long_name: str = None, units: str = None, standard_name: str = None, reader_kwargs: dict = {})

Retrieve data for the specified variable using the parent Diagnostic class.

Parameters:
  • var (str) – Variable to retrieve.

  • formula (bool) – Whether to use formula for variable.

  • long_name (str) – Long name of variable.

  • units (str) – Units of variable.

  • standard_name (str) – Standard name of variable.

  • reader_kwargs (dict) – Additional Reader kwargs.

run(var: str, formula: bool = False, long_name: str = None, units: str = None, standard_name: str = None, box_brd: bool = True, density: bool = True, outputdir: str = './', rebuild: bool = True, reader_kwargs: dict = {})

Run all steps for histogram computation.

Parameters:
  • var (str) – Variable to retrieve and compute.

  • formula (bool) – Use formula for variable.

  • long_name (str) – Long name of variable.

  • units (str) – Units of variable.

  • standard_name (str) – Standard name of variable.

  • box_brd (bool) – Include box boundaries.

  • density (bool) – Return PDF (normalized) instead of counts.

  • outputdir (str) – Output directory.

  • rebuild (bool) – Rebuild existing files.

  • reader_kwargs (dict) – Additional Reader kwargs.

save_netcdf(outputdir: str = './', rebuild: bool = True)

Save histogram data to netcdf file.

Parameters:
  • outputdir (str) – Output directory.

  • rebuild (bool) – Rebuild if file exists.

class aqua.diagnostics.Hovmoller(model: str, exp: str, source: str, catalog: str = None, regrid: str = None, startdate: str = None, enddate: str = None, diagnostic_name: str = 'oceandrift', vert_coord: str = 'level', loglevel: str = 'WARNING')

Bases: Diagnostic

A class for generating Hovmoller diagrams from ocean model data.

This class provides methods to retrieve, process, and save netCDF files for Hovmoller diagrams. It inherits from the Diagnostic class.

logger

Logger instance for the class.

Type:

Logger

outputdir

Directory to save the output files.

Type:

str

region

Region for area selection.

Type:

str

var

List of variables to process.

Type:

list

stacked_data

Processed data for Hovmoller diagrams.

Type:

xarray.Dataset

Initialize the Hovmoller class.

Parameters:
  • model (str) – Model name.

  • exp (str) – Experiment name.

  • source (str) – Data source.

  • catalog (str, optional) – Path to the catalog file.

  • regrid (str, optional) – Regridding method.

  • startdate (str, optional) – Start date for data retrieval.

  • enddate (str, optional) – End date for data retrieval.

  • diagnostic_name (str, optional) – Name of the diagnostic for filenames. Defaults to “oceandrift”.

  • vert_coord (str, optional) – Name of the vertical dimension coordinate. Defaults to DEFAULT_OCEAN_VERT_COORD (“level”).

  • loglevel (str, optional) – Logging level. Defaults to “WARNING”.

MINIMUM_MONTHS_REQUIRED = 2
compute_hovmoller(dim_mean: str = None, anomaly_ref: str | list = None)

Process input data for drift analysis by applying various transformations and aggregations.

Parameters:
  • dim_mean (str or None) – The dimension along which to compute the mean. If None, no mean is computed.

  • anomaly_ref (str or list, optional) – Reference for anomaly calculation. Can be “t0”, “tmean”, or None. By default, full values are used.

Returns:

A concatenated DataArray containing processed data for different combinations of anomaly, standardization, and anomaly reference types.

Return type:

xarray.DataArray
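
To make the anomaly_ref options concrete, here is a pure-Python sketch on a single time series. It assumes that “t0” means the first time step and “tmean” means the time mean of the series, which is inferred from the option names rather than stated in this reference.

```python
from statistics import fmean


def anomaly(series, anomaly_ref=None):
    """Return full values (None) or anomalies relative to 't0' or 'tmean'."""
    if anomaly_ref is None:
        return list(series)  # full values, no reference subtracted
    if anomaly_ref == "t0":
        ref = series[0]      # assumed: value at the first time step
    elif anomaly_ref == "tmean":
        ref = fmean(series)  # assumed: mean over the whole period
    else:
        raise ValueError(f"Unknown anomaly_ref: {anomaly_ref!r}")
    return [v - ref for v in series]


temperature = [10.0, 12.0, 14.0]
drift_from_start = anomaly(temperature, "t0")
drift_from_mean = anomaly(temperature, "tmean")
```

Under these assumptions, a “t0” anomaly shows drift relative to the initial state, while “tmean” centres the series around zero.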

run(outputdir: str = '.', rebuild: bool = True, region: str = None, var: list = ['thetao', 'so'], dim_mean=['lat', 'lon'], anomaly_ref: str = None, reader_kwargs: dict = {})

Run the Hovmoller diagram generation workflow.

This method retrieves the specified variables, applies region selection if provided, computes Hovmoller diagrams with optional mean and anomaly processing, and saves the results to netCDF files.

Parameters:
  • outputdir (str, optional) – Directory to save the output files. Defaults to “.”.

  • rebuild (bool, optional) – Whether to rebuild the netCDF file. Defaults to True.

  • region (str, optional) – Region for area selection. Defaults to None (global evaluation).

  • var (list, optional) – List of variables to process. Defaults to [“thetao”, “so”].

  • dim_mean (list, optional) – List of dimensions over which to compute the mean. Defaults to [“lat”, “lon”].

  • anomaly_ref (str or None, optional) – Reference for anomaly calculation. Can be “t0”, “tmean”, or None.

  • reader_kwargs (dict, optional) – Additional keyword arguments for the Reader. Defaults to {}.

save_netcdf(diagnostic_product: str = 'hovmoller', region: str = None, outputdir: str = '.', rebuild: bool = True)

Save the processed data to a netCDF file.

Parameters:
  • diagnostic_product (str) – Name of the diagnostic product.

  • region (str) – Region for area selection. Defaults to None.

  • outputdir (str) – Directory to save the output files. Defaults to ‘.’.

  • rebuild (bool, optional) – Whether to rebuild the netCDF file. Defaults to True.

sort_key(data)

Return a sort key for ordering processed data by drift type.

class aqua.diagnostics.LatLonProfiles(model: str, exp: str, source: str, catalog: str = None, regrid: str = None, startdate: str = None, enddate: str = None, std_startdate: str = None, std_enddate: str = None, region: str = None, lon_limits: list = None, lat_limits: list = None, regions_file_path: str = None, mean_type: str = 'zonal', diagnostic_name: str = 'latlonprofile', loglevel: str = 'WARNING')

Bases: Diagnostic

Class to compute lat-lon profiles of a variable over a specified region. It retrieves the data from the catalog, computes the mean and standard deviation over the specified period and saves the results to netcdf files.

Supported Frequencies:
  • ‘seasonal’: Computes seasonal means (DJF, MAM, JJA, SON)

  • ‘longterm’: Computes the temporal mean over the entire analysis period

Supported Mean Types:
  • ‘zonal’: Average over longitude, producing latitude profiles

  • ‘meridional’: Average over latitude, producing longitude profiles
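
The difference between the two mean types can be sketched on a toy grid. This is a plain-Python illustration, not the AQUA code path: the function `profile` and the data are hypothetical, and the sketch ignores any area weighting that a real computation may apply.

```python
from statistics import fmean

# Toy field indexed as field[lat_index][lon_index]: 2 latitudes x 3 longitudes.
field = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]


def profile(field, mean_type="zonal"):
    """Collapse one horizontal dimension of a lat x lon field (hypothetical helper)."""
    if mean_type == "zonal":
        # Average over longitude: one value per latitude.
        return [fmean(row) for row in field]
    if mean_type == "meridional":
        # Average over latitude: one value per longitude.
        return [fmean(column) for column in zip(*field)]
    raise ValueError(f"unknown mean_type: {mean_type!r}")


zonal = profile(field, "zonal")
meridional = profile(field, "meridional")
```

The zonal result has one entry per latitude and the meridional result has one entry per longitude.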

Initialize the LatLonProfiles class.

Parameters:
  • model (str) – The model to be used for the retrieval of the data.

  • exp (str) – The experiment to be used for the retrieval of the data.

  • source (str) – The source to be used for the retrieval of the data.

  • catalog (str, optional) – The catalog to be used for the retrieval of the data.

  • regrid (str, optional) – The regridding method to be used for the retrieval of the data.

  • startdate (str, optional) – The start date of the plot/analysis period.

  • enddate (str, optional) – The end date of the plot/analysis period.

  • std_startdate (str, optional) – The start date of the standard deviation period.

  • std_enddate (str, optional) – The end date of the standard deviation period.

  • region (str, optional) – The region to be used for the retrieval of the data.

  • lon_limits (list, optional) – The longitude limits of the region.

  • lat_limits (list, optional) – The latitude limits of the region.

  • regions_file_path (str, optional) – The path to the regions file. Default is the AQUA config path.

  • mean_type (str, optional) – The type of mean to compute (‘zonal’ or ‘meridional’).

  • diagnostic_name (str, optional) – The name of the diagnostic.

  • loglevel (str, optional) – The log level to be used for the logging.

MINIMUM_MONTHS_REQUIRED = 12
compute_dim_mean(freq: str, exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True)

Compute the mean of the data. Support for seasonal and longterm means.

Parameters:
  • freq (str) – The frequency to be used (‘seasonal’ or ‘longterm’).

  • exclude_incomplete (bool) – If True, exclude incomplete periods.

  • center_time (bool) – If True, the time will be centered.

  • box_brd (bool, optional) – Whether the box boundary coordinates are included in the area selection. Default is True.
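
The two supported frequencies can be sketched as follows. This is a simplified pure-Python illustration under stated assumptions, not the AQUA implementation: `compute_mean` and the monthly data are hypothetical, and the handling of incomplete periods and time centering is not modelled.

```python
from statistics import fmean

# Month number -> season label, following the DJF/MAM/JJA/SON convention.
SEASONS = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def compute_mean(monthly, freq):
    """Average (month, value) pairs per season or over the whole period."""
    if freq == "longterm":
        return fmean(value for _, value in monthly)
    if freq == "seasonal":
        groups = {}
        for month, value in monthly:
            groups.setdefault(SEASONS[month], []).append(value)
        return {season: fmean(values) for season, values in groups.items()}
    raise ValueError(f"unknown freq: {freq!r}")


# One hypothetical year in which each monthly value equals its month number.
monthly = [(month, float(month)) for month in range(1, 13)]
longterm = compute_mean(monthly, "longterm")
seasonal = compute_mean(monthly, "seasonal")
```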

compute_std(freq: str, exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True)

Compute the standard deviation of the data over the std period. Supports seasonal and longterm frequencies.

Parameters:
  • freq (str) – The frequency to be used (‘seasonal’ or ‘longterm’).

  • exclude_incomplete (bool) – If True, exclude incomplete periods.

  • center_time (bool) – If True, the time will be centered.

  • box_brd (bool, optional) – Whether the box boundary coordinates are included in the area selection. Default is True.

retrieve(var: str, formula: bool = False, long_name: str = None, units: str = None, standard_name: str = None, reader_kwargs: dict = {})

Retrieve the data for the specified variable and apply any formula if required.

Parameters:
  • var (str) – The variable to be retrieved.

  • formula (bool) – Whether to use a formula for the variable.

  • long_name (str) – The long name of the variable.

  • units (str) – The units of the variable.

  • standard_name (str) – The standard name of the variable.

  • reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

run(var: str, formula: bool = False, long_name: str = None, units: str = None, standard_name: str = None, std: bool = False, freq: list = ['seasonal', 'longterm'], exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True, outputdir: str = './', rebuild: bool = True, reader_kwargs: dict = {})

Run all the steps necessary for the computation of the LatLonProfiles.

Parameters:
  • var (str) – The variable to be retrieved and computed.

  • formula (bool) – Whether to use a formula for the variable.

  • long_name (str) – The long name of the variable.

  • units (str) – The units of the variable.

  • standard_name (str) – The standard name of the variable.

  • std (bool) – Whether to compute the standard deviation.

  • freq (list) – The frequencies to compute. Options are ‘seasonal’ (seasonal means: DJF, MAM, JJA, SON) and ‘longterm’ (long-term mean over the entire analysis period).

  • exclude_incomplete (bool) – Whether to exclude incomplete time periods.

  • center_time (bool) – Whether to center the time coordinate.

  • box_brd (bool) – Whether to include the box boundaries.

  • outputdir (str) – The output directory to save the results.

  • rebuild (bool) – Whether to rebuild existing files.

  • reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

save_netcdf(freq: str, outputdir: str = './', rebuild: bool = True)

Save the data to a netcdf file.

Parameters:
  • freq (str) – The frequency of the data (‘seasonal’ or ‘longterm’).

  • outputdir (str) – The directory to save the data.

  • rebuild (bool) – If True, rebuild the data from the original files.

class aqua.diagnostics.MJO(catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, configdir: str = None, definition: str = 'teleconnections-destine', loglevel: str = 'WARNING')

Bases: BaseMixin

MJO (Madden-Julian Oscillation) class.

Initialize the MJO class.

Parameters:
  • catalog (str) – Catalog name.

  • model (str) – Model name.

  • exp (str) – Experiment name.

  • source (str) – Source name.

  • regrid (str) – Regrid method.

  • startdate (str) – Start date for data retrieval.

  • enddate (str) – End date for data retrieval.

  • configdir (str) – Configuration directory. Default is the installation directory.

  • definition (str) – definition filename. Default is ‘teleconnections-destine’.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 24
compute_hovmoller(day_window: int = None)

Compute the Hovmoller plot for the MJO index. This method prepares the data for a Hovmoller plot by selecting the MJO box, evaluating anomalies, and smoothing the data if required.

Parameters:

day_window (int, optional) – Number of days to be used in the smoothing window. If None, no smoothing is performed. Default is None.

retrieve(reader_kwargs: dict = {}) → None

Retrieve the data for the MJO Hovmoller plot.

Parameters:

reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

class aqua.diagnostics.NAO(catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, configdir: str = None, definition: str = 'teleconnections-destine', loglevel: str = 'WARNING')

Bases: BaseMixin

North Atlantic Oscillation (NAO) index calculation class. This class is used to calculate the NAO index from a given dataset. It inherits from the BaseMixin class and implements the necessary methods to calculate the NAO index.

Initialize the NAO class.

Parameters:
  • catalog (str) – Catalog name.

  • model (str) – Model name.

  • exp (str) – Experiment name.

  • source (str) – Source name.

  • regrid (str) – Regrid method.

  • startdate (str) – Start date for data retrieval.

  • enddate (str) – End date for data retrieval.

  • configdir (str) – Configuration directory. Default is the installation directory.

  • definition (str) – definition filename. Default is ‘teleconnections-destine’.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 24
compute_index(months_window: int = 3, rebuild: bool = False)

Evaluate the station-based index for a teleconnection. Field data must be monthly gridded data.

Parameters:
  • months_window (int, optional) – Number of months for the rolling average. Default is 3.

  • rebuild (bool, optional) – If True, the index is recalculated. Default is False.
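
The idea of a station-based index smoothed with a rolling average can be sketched in plain Python. This is an illustration under stated assumptions, not the AQUA algorithm: the helpers `rolling_mean` and `station_index` are hypothetical, the index is assumed to be the difference between two station series, only full windows are kept, and any standardization step is omitted.

```python
from statistics import fmean


def rolling_mean(values, window):
    """Mean over each full window of `window` consecutive values."""
    return [fmean(values[i:i + window]) for i in range(len(values) - window + 1)]


def station_index(station_a, station_b, months_window=3):
    """Difference of two monthly station series, smoothed by a rolling mean."""
    difference = [a - b for a, b in zip(station_a, station_b)]
    return rolling_mean(difference, months_window)


# Hypothetical monthly values at two stations.
south = [3.0, 5.0, 7.0, 9.0, 11.0]
north = [2.0, 3.0, 4.0, 5.0, 6.0]
index = station_index(south, north, months_window=3)
```

In this sketch, a window of 3 months applied to 5 monthly values yields 3 smoothed values.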

retrieve(reader_kwargs: dict = {}) → None

Retrieve the data for the NAO index.

Parameters:

reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

class aqua.diagnostics.PerformanceIndices(exp, year1, year2, config='config.yml', loglevel='WARNING', numproc=1, climatology=None, interface=None, model=None, ensemble='r1i1p1f1', silent=None, xdataset=None, outputdir=None, extrafigure=False, title=None)

Bases: object

Class to compute the performance indices for a given experiment and years.

exp

Experiment name.

Type:

str

year1

Start year of the experiment.

Type:

int

year2

End year of the experiment.

Type:

int

config

Path to the configuration file. Default is ‘config.yml’.

Type:

str

loglevel

Logging level. Default is ‘WARNING’.

Type:

str

numproc

Number of processes to use. Default is 1.

Type:

int

climatology

Climatology to use. Default is ‘EC24’.

Type:

str

interface

Path to the interface file.

Type:

str

model

Model name.

Type:

str

ensemble

Ensemble identifier. Default is ‘r1i1p1f1’.

Type:

str

silent

If True, suppress output. Default is None.

Type:

bool

xdataset

Dataset to use.

Type:

xarray.Dataset

outputdir

Directory to store output files.

Type:

str

loggy

Logger instance.

Type:

logging.Logger

diag

Diagnostic instance.

Type:

Diagnostic

face

Interface dictionary.

Type:

dict

piclim

Climatology dictionary.

Type:

dict

util_dictionary

Utility dictionary for remapping and masks.

Type:

Supporter

varstat

Dictionary to store variable statistics.

Type:

dict

funcname

Name of the class.

Type:

str

start_time

Start time for performance measurement.

Type:

float

title

Title of the plot, overrides default title.

Type:

str

toc(message)

Update the timer and log the elapsed time.

prepare()

Prepare the necessary components for performance indices calculation.

run()

Run the performance indices calculation.

store(yamlfile=None)

Store the performance indices in a yaml file.

plot(mapfile=None, figformat='pdf')

Generate the heatmap for performance indices.

pi_worker(util, piclim, face, diag, field_3d, varstat, varlist)

Main parallel diagnostic worker for performance indices.

Initialize the PerformanceIndices class with the given parameters.

final_toc()

Log the total elapsed time since the start.

static pi_worker(util, piclim, face, diag, field_3d, varstat, dictarray, varlist, loglevel)

Main parallel diagnostic worker for performance indices.

Parameters:
  • util (Supporter) – Utility dictionary for remapping and masks.

  • piclim (dict) – Climatology dictionary.

  • face (dict) – Interface dictionary.

  • diag (Diagnostic) – Diagnostic instance.

  • field_3d (list) – List of 3D fields.

  • varstat (dict) – Dictionary to store variable statistics.

  • dictarray (dict) – Dictionary to store the output array.

  • varlist (list) – List of variables to process.

plot(diagname='performance_indices', mapfile=None, figformat='pdf', storefig=True, returnfig=False)

Generate the heatmap for performance indices.

Parameters:
  • diagname (str) – Name of the diagnostic. Default is ‘performance_indices’.

  • mapfile (str) – Path to the output file. If None, it will be defined automatically following ECmean syntax.

  • storefig (bool) – If True, store the figure in the specified file. Default is True.

  • returnfig (bool) – If True, return the figure object. Default is False.

prepare()

Prepare the necessary components for performance indices calculation.

run()

Run the performance indices calculation.

store(yamlfile=None)

Store the performance indices in a yaml file.

toc(message)

Update the timer and log the elapsed time.

class aqua.diagnostics.Plot2DSeaIce(ref=None, models=None, regions_to_plot: list = ['Arctic', 'Antarctic'], outputdir='./', rebuild=True, dpi=300, loglevel='WARNING')

Bases: object

A class for processing and visualizing surface maps and biases of sea ice fraction or thickness.

Parameters:
  • ref (xarray.DataArray or xarray.Dataset) – Reference sea ice data.

  • models (list of xarray.DataArray or xarray.Dataset) – List of models with sea ice data.

  • regions_to_plot (list) – List of strings with the region names to plot which must match the ‘AQUA_region’ attribute in the data provided as input.

  • outputdir (str) – Output directory for saving plots.

  • rebuild (bool) – Whether to rebuild the plots if they already exist.

  • dpi (int) – Dots per inch for the saved figures.

  • loglevel (str) – Logging level for the logger. Default is ‘WARNING’.

plot_2d_seaice(plot_type='var', months=[3, 9], method='fraction', projkw=None, plot_ref_contour=False, save_format=['png', 'pdf', 'svg'], show=False, **kwargs)

Plot sea ice data and biases.

Parameters:
  • plot_type (str) – Type of plot to generate [‘var’ or ‘bias’].

  • months (list) – List of months to plot, e.g. [2, 9] for February and September.

  • projkw (dict) – Dictionary with projection parameters for the plot.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • plot_ref_contour (bool) – Whether to add a reference line at 0.2 for sea ice fraction.

  • show (bool) – If True, display the plot interactively (e.g., in Jupyter notebooks).

  • **kwargs – Additional keyword arguments for customization. See the functions below for details.
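
Because save_format accepts either a single string or a list, a caller-side normalization can be sketched as below. The helpers `normalize_formats` and `output_names` and the file naming are hypothetical illustrations, not part of the AQUA API.

```python
def normalize_formats(save_format):
    """Return a list of format strings from a str or an iterable of str."""
    if isinstance(save_format, str):
        return [save_format]
    return list(save_format)


def output_names(basename, save_format):
    """Build one hypothetical output filename per requested format."""
    return [f"{basename}.{fmt}" for fmt in normalize_formats(save_format)]


single = output_names("seaice_arctic", "png")
several = output_names("seaice_arctic", ["png", "pdf", "svg"])
```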

class aqua.diagnostics.PlotBoxplots(diagnostic='boxplots', save_format=['png', 'pdf', 'svg'], dpi=300, outputdir='./', loglevel='WARNING')

Bases: object

Initialize the PlotBoxplots class.

Parameters:
  • diagnostic (str) – Name of the diagnostic.

  • save_format (str, list) – Format(s) to save the figure in (e.g. ‘png’, ‘pdf’, ‘svg’).

  • dpi (int) – Resolution of saved figures.

  • outputdir (str) – Output directory for saved plots.

  • loglevel (str) – Logging level.

plot_boxplots(data, data_ref=None, var=None, anomalies=False, add_mean_line=False, ref_number=0, title=None)

Plot boxplots for specified variables in the dataset.

Parameters:
  • data (xarray.Dataset or list of xarray.Dataset) – Input dataset(s) containing the fldmeans of the variables to plot.

  • data_ref (xarray.Dataset or list of xarray.Dataset, optional) – Reference dataset(s) for comparison.

  • var (str or list of str) – Variable name(s) to plot. If None, uses all variables in the dataset.

  • anomalies (bool) – Whether to plot anomalies instead of absolute values.

  • add_mean_line (bool) – Whether to add dashed lines for means.

  • ref_number (int) – Position of reference dataset in data_ref list to use when plotting anomalies.

  • title (str, optional) – Title for the plot. If None, a default title will be generated.

class aqua.diagnostics.PlotENSO(indexes=None, ref_indexes=None, outputdir: str = './', rebuild: bool = True, loglevel: str = 'WARNING')

Bases: PlotBaseMixin

Plot the ENSO products.

Parameters:
  • indexes (list) – List of indexes to plot.

  • ref_indexes (list) – List of reference indexes to plot.

  • outputdir (str) – Directory to save the plots. Default is ‘./’.

  • rebuild (bool) – If True, rebuild the plots. Default is True.

  • loglevel (str) – Log level for the logger. Default is ‘WARNING’.

plot_index(thresh: float = 0.5)
plot_maps(maps=None, ref_maps=None, statistic: str = None, vmin: float = None, vmax: float = None, vmin_diff: float = None, vmax_diff: float = None, **kwargs)

Plot the maps for the ENSO products.

Parameters:
  • maps (list) – List of maps to plot.

  • ref_maps (list) – List of reference maps to plot.

  • statistic (str) – Statistic to plot. Default is None.

  • vmin (float) – Minimum value for the color value. Default is None.

  • vmax (float) – Maximum value for the color value. Default is None.

  • vmin_diff (float) – Minimum value for the color value for the difference. Default is None.

  • vmax_diff (float) – Maximum value for the color value for the difference. Default is None.

  • **kwargs – Additional arguments for the plotting function.

Returns:

Figure object.

Return type:

fig

set_index_description()

Set the description of the index. This is used to generate the caption of the figure.

Parameters:

index_name (str) – The name of the index. Default is None.

Returns:

The caption of the figure.

Return type:

str

set_map_description(maps=None, ref_maps=None, statistic: str = None)

Set the description for the maps.

Parameters:
  • maps (list) – List of maps to plot.

  • ref_maps (list) – List of reference maps to plot.

  • statistic (str) – Statistic to plot. Default is None.

Returns:

Description of the maps.

Return type:

str

class aqua.diagnostics.PlotEnsembleLatLon(diagnostic_product: str = 'EnsembleLatLon', catalog_list: list[str] = None, model_list: list[str] = None, exp_list: list[str] = None, source_list: list[str] = None, region: str = None, outputdir='./', loglevel: str = 'WARNING')

Bases: BaseMixin

Class to plot the ensemble lat-lon.

Class for plotting ensemble latitude-longitude (Lat-Lon) data.

This class inherits from BaseMixin and provides functionality to generate plots of ensemble datasets on a latitude-longitude grid. It supports multiple catalogs, models, experiments, and sources, and allows saving plots as PNG or PDF files. The class is intended for ensemble statistics visualization, such as mean and standard deviation maps.

Parameters:
  • diagnostic_product (str, optional) – Name of the diagnostic product. Defaults to “EnsembleLatLon”.

  • catalog_list (list[str], optional) – List of catalog names. If None, assigned to ‘None_catalog’.

  • model_list (list[str], optional) – List of model names. If None, assigned to ‘None_model’.

  • exp_list (list[str], optional) – List of experiment names. If None, assigned to ‘None_exp’.

  • source_list (list[str], optional) – List of data source names. If None, assigned to ‘None_source’.

  • region (str, optional) – Name of the region for plotting. Defaults to None.

  • outputdir (str, optional) – Directory to save output plots. Defaults to “./”.

  • loglevel (str, optional) – Logging level. Defaults to “WARNING”.

figure

The figure object for the plot.

Type:

matplotlib.figure.Figure or None

diagnostic_product

Name of the diagnostic product being visualized.

Type:

str

catalog_list

List of catalogs being processed.

Type:

list[str]

model_list

List of models being processed.

Type:

list[str]

exp_list

List of experiments being processed.

Type:

list[str]

source_list

List of sources being processed.

Type:

list[str]

region

Region name for plotting.

Type:

str

outputdir

Directory path for saving plots.

Type:

str

loglevel

Logging level for messages.

Type:

str

Notes

  • Designed to visualize ensemble mean and standard deviation on Lat-Lon grids.

  • Integrates with BaseMixin for consistent handling of catalogs, models, and experiments.

  • Uses self.save_figure for saving output plots in PNG and PDF formats.

plot(var: str = None, dataset_mean=None, dataset_std=None, long_name=None, description=None, dpi=300, title_mean=None, title_std=None, save_format=['png', 'pdf', 'svg'], vmin_mean=None, vmax_mean=None, vmin_std=None, vmax_std=None, proj='robinson', proj_params={}, transform_first=False, cyclic_lon=True, contour=True, coastlines=True, cbar_label=None, units=None)

Plot ensemble mean and standard deviation on a latitude-longitude map.

Generates 2D maps of ensemble mean and standard deviation for a given variable using the specified projection and visualization options. The resulting figures can be saved as PNG and/or PDF files.

Parameters:
  • var (str) – Variable name to plot.

  • dataset_mean (xarray.DataArray or Dataset) – Ensemble mean dataset.

  • dataset_std (xarray.DataArray or Dataset) – Ensemble standard deviation dataset.

  • long_name (str, optional) – Long descriptive name for the variable. Defaults to None.

  • description (str, optional) – Description string for saving the plot. Defaults to None.

  • dpi (int, optional) – Resolution for saved figures. Default is 300.

  • title_mean (str, optional) – Title for mean plot. Auto-generated if None.

  • title_std (str, optional) – Title for standard deviation plot. Auto-generated if None.

  • save_format (str or list, optional) – Format(s) to save figures in (e.g. ‘png’, ‘pdf’, ‘svg’). Default is SAVE_FORMAT.

  • vmin_mean (float, optional) – Color scale limits for mean plot. Auto-set if None.

  • vmax_mean (float, optional) – Color scale limits for mean plot. Auto-set if None.

  • vmin_std (float, optional) – Color scale limits for std plot. Auto-set if None.

  • vmax_std (float, optional) – Color scale limits for std plot. Auto-set if None.

  • proj (str, optional) – Map projection. Default is “robinson”.

  • proj_params (dict, optional) – Extra parameters for the projection. Defaults to {}.

  • transform_first (bool, optional) – Whether to transform data before plotting. Default is False.

  • cyclic_lon (bool, optional) – Whether longitude is cyclic. Default is True.

  • contour (bool, optional) – Overlay contours. Default is True.

  • coastlines (bool, optional) – Draw coastlines. Default is True.

  • cbar_label (str, optional) – Label for the colorbar. Auto-generated if None.

  • units (str, optional) – Units of the variable. Used for titles and labels.

Returns:

Dictionary containing figure and axes for mean and std plots:

{‘mean_plot’: [fig1, ax1], ‘std_plot’: [fig2, ax2]}. If standard deviation is zero everywhere, only ‘mean_plot’ is returned.

Return type:

dict

Raises:

NoDataError – If dataset_mean or dataset_std is None.

Notes

  • Titles and colorbar labels are automatically generated if not provided.

  • Uses self.save_figure to save figures in the formats specified.

  • Handles both xarray.DataArray and Dataset inputs.

  • If vmin_std equals vmax_std, std plot is skipped.
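
Since the ‘std_plot’ entry may be absent from the returned dictionary, calling code should not assume that both keys exist. The following is a minimal sketch of defensive handling; `available_plots` is a hypothetical helper, and plain strings stand in for the figure and axes objects.

```python
def available_plots(result):
    """Return the (name, fig, ax) triples present in a result dictionary."""
    found = []
    for name in ("mean_plot", "std_plot"):
        if name in result:
            fig, ax = result[name]
            found.append((name, fig, ax))
    return found


# Placeholder results: strings are used instead of matplotlib objects.
full = {"mean_plot": ["fig1", "ax1"], "std_plot": ["fig2", "ax2"]}
mean_only = {"mean_plot": ["fig1", "ax1"]}

names_full = [name for name, _, _ in available_plots(full)]
names_mean_only = [name for name, _, _ in available_plots(mean_only)]
```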

class aqua.diagnostics.PlotEnsembleTimeseries(diagnostic_product: str = 'EnsembleTimeseries', catalog_list: list[str] = None, model_list: list[str] = None, exp_list: list[str] = None, source_list: list[str] = None, ref_catalog: str = None, ref_model: str = None, ref_exp: str = None, region: str = None, outputdir='./', loglevel: str = 'WARNING')

Bases: BaseMixin

Class to plot the ensemble timeseries.

Parameters:
  • diagnostic_product (str) – The name of the diagnostic product. Default is ‘EnsembleTimeseries’. This will be used to configure the logger and the output files.

  • catalog_list (list[str]) – List of catalog names. Default is None, in which case it is assigned to ‘None_catalog’. For multiple catalogs, it is assigned to ‘multi-catalog’.

  • model_list (list[str]) – List of model names. Default is None, in which case it is assigned to ‘None_model’. For multiple models, it is assigned to ‘multi-model’.

  • exp_list (list[str]) – List of experiment names. Default is None, in which case it is assigned to ‘None_exp’. For multiple experiments, it is assigned to ‘multi-exp’.

  • source_list (list[str]) – List of source names. Default is None, in which case it is assigned to ‘None_source’. For multiple sources, it is assigned to ‘multi-source’.

  • ref_catalog (str) – This is specific to timeseries reference data catalog. Default is None.

  • ref_model (str) – This is specific to timeseries reference data model. Default is None.

  • ref_exp (str) – This is specific to timeseries reference data exp. Default is None.

  • ensemble_dimension_name (str) – Name of the dimension along which the individual Datasets were concatenated. Default is “ensemble”.

  • outputdir (str) – String input for output path. Default is ‘./’

  • loglevel (str) – Log level. Default is “WARNING”.

plot(var=None, title=None, startdate=None, enddate=None, hourly_data=None, hourly_data_mean=None, hourly_data_std=None, daily_data=None, daily_data_mean=None, daily_data_std=None, monthly_data=None, monthly_data_mean=None, monthly_data_std=None, annual_data=None, annual_data_mean=None, annual_data_std=None, ref_hourly_data=None, ref_daily_data=None, ref_monthly_data=None, ref_annual_data=None, description=None, save_format=['png', 'pdf', 'svg'], dpi=300, figure_size=[10, 5], plot_ensemble_members=True)

Plot the ensemble mean and +/- 2 x standard deviation of the ensemble around the ensemble mean. The individual ensemble members can also be plotted. The +/- 2 x STD band is not plotted for the reference.

Parameters:
  • title (str) – Title for plot.

  • startdate (str) – Start date to be included in the title. Default is None.

  • enddate (str) – End date to be included in the title. Default is None.

  • description (str) – specific for saving the plot.

  • figure_size (list) – Size of the figure. Default is [10, 5].

  • save_format (str or list) – Format(s) to save the figure in (e.g. ‘png’, ‘pdf’, ‘svg’). Default is SAVE_FORMAT.

  • dpi (int) – Resolution for saved figures. Default is 300.

  • plot_ensemble_members (bool) – Whether to plot the individual ensemble members. Default is True.

  • ref_hourly_data – reference hourly timeseries xarray.Dataset. Default is None.

  • ref_daily_data – reference daily timeseries xarray.Dataset. Default is None.

  • ref_monthly_data – reference monthly timeseries xarray.Dataset. Default is None.

  • ref_annual_data – reference annual timeseries xarray.Dataset. Default is None.

  • hourly_data – xarray Dataset of ensemble members of hourly timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • hourly_data_mean – None

  • hourly_data_std – None

  • daily_data – xarray Dataset of ensemble members of daily timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • daily_data_mean – None

  • daily_data_std – None

  • monthly_data – xarray Dataset of ensemble members of monthly timeseries. The ensemble members are concatenated along a new dimension “ensemble”.

  • annual_data – xarray Dataset of ensemble members of annual timeseries. The ensemble members are concatenated along the dimension “ensemble”

  • ensemble_dimension_name (str) – Name of the dimension along which the individual Datasets were concatenated. Default is “ensemble”.

  • monthly_data_mean – xarray.Dataset timeseries monthly mean.

  • monthly_data_std – xarray.Dataset timeseries monthly std.

  • annual_data_mean – xarray.Dataset timeseries annual mean.

  • annual_data_std – xarray.Dataset timeseries annual std.

Returns:

fig, ax

NOTE: The STD is computed and plotted point-wise around the mean.
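
The point-wise band of +/- 2 x STD around the ensemble mean can be sketched in plain Python. This illustrates the arithmetic only and is not the AQUA plotting code: `ensemble_band` and the member data are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev

# Three hypothetical ensemble members, each a series over four time steps.
members = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]


def ensemble_band(members, n_std=2.0):
    """Return point-wise (mean, lower, upper) lists across ensemble members."""
    mean, lower, upper = [], [], []
    for values in zip(*members):   # values of all members at one time step
        m = fmean(values)
        s = pstdev(values)         # assumption: population standard deviation
        mean.append(m)
        lower.append(m - n_std * s)
        upper.append(m + n_std * s)
    return mean, lower, upper


mean, lower, upper = ensemble_band(members)
```

Each time step gets its own mean and its own spread, so the band width can vary along the series.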

class aqua.diagnostics.PlotEnsembleZonal(diagnostic_product: str = 'EnsembleZonal', catalog_list: list[str] = None, model_list: list[str] = None, exp_list: list[str] = None, source_list: list[str] = None, region: str = None, outputdir='./', loglevel: str = 'WARNING')

Bases: BaseMixin

Class for plotting ensemble zonal mean data.

This class inherits from BaseMixin and provides functionality to visualize ensemble datasets as zonal averages. It supports multiple catalogs, models, experiments, and sources, and allows specifying a region for the analysis. The resulting plots can be saved to a specified output directory.

Parameters:
  • diagnostic_product (str, optional) – Name of the diagnostic product. Defaults to “EnsembleZonal”.

  • catalog_list (list[str], optional) – List of catalog names. If None, assigned to ‘None_catalog’.

  • model_list (list[str], optional) – List of model names. If None, assigned to ‘None_model’.

  • exp_list (list[str], optional) – List of experiment names. If None, assigned to ‘None_exp’.

  • source_list (list[str], optional) – List of source names. If None, assigned to ‘None_source’.

  • region (str, optional) – Name of the region for zonal averaging. Defaults to None.

  • outputdir (str, optional) – Directory path to save plots. Defaults to “./”.

  • loglevel (str, optional) – Logging level. Defaults to “WARNING”.

diagnostic_product

Name of the diagnostic product.

Type:

str

catalog_list

List of catalogs being processed.

Type:

list[str]

model_list

List of models being processed.

Type:

list[str]

exp_list

List of experiments being processed.

Type:

list[str]

source_list

List of sources being processed.

Type:

list[str]

region

Region used for zonal analysis.

Type:

str

outputdir

Output directory for saving plots.

Type:

str

loglevel

Logging level for messages.

Type:

str

plot(var: str = None, dataset_mean=None, dataset_std=None, description=None, title_mean=None, title_std=None, figure_size=[10, 8], cbar_label=None, save_format=['png', 'pdf', 'svg'], dpi=300, units=None, ylim=(5500, 0), levels=20, cmap='RdBu_r', ylabel='Depth (in m)', xlabel='Latitude (in deg North)')

Plot ensemble mean and standard deviation of zonal averages in Lev-Lat coordinates.

This method generates contour plots of the ensemble mean and standard deviation for a given variable on a latitude vs. vertical level (Lev) grid. The resulting plots can be saved as PNG and/or PDF files using the save_figure method. find_vert_coord from aqua.core.util is used to read the vertical coordinate.

Parameters:
  • var (str) – Name of the variable to plot.

  • dataset_mean (xarray.DataArray or xarray.Dataset) – Ensemble mean data.

  • dataset_std (xarray.DataArray or xarray.Dataset) – Ensemble standard deviation data.

  • description (str, optional) – Description for saving the plots.

  • title_mean (str, optional) – Title for the mean plot. Auto-generated if None.

  • title_std (str, optional) – Title for the standard deviation plot. Auto-generated if None.

  • figure_size (list[int], optional) – Figure size [width, height]. Default is [10, 8].

  • cbar_label (str, optional) – Label for the colorbar.

  • save_format (str or list, optional) – Format(s) to save plots in (e.g. ‘png’, ‘pdf’, ‘svg’). Default is SAVE_FORMAT.

  • dpi (int, optional) – Resolution for saved figures. Default is 300.

  • units (str, optional) – Units of the variable. Used in titles and labels if provided.

  • ylim (tuple, optional) – Y-axis limits for the plot (vertical levels). Default is (5500, 0).

  • levels (int, optional) – Number of contour levels. Default is 20.

  • cmap (str, optional) – Colormap to use. Default is “RdBu_r”.

  • ylabel (str, optional) – Label for y-axis. Default is “Depth (in m)”.

  • xlabel (str, optional) – Label for x-axis. Default is “Latitude (in deg North)”.

Returns:

Dictionary containing figure and axes objects for mean and std plots:

{"mean_plot": [fig1, ax1], "std_plot": [fig2, ax2]}

Return type:

dict

Raises:

NoDataError – If dataset_mean or dataset_std is None.

Notes

  • Automatically generates titles for mean and STD if not provided.

  • Uses self.save_figure to save the plots as PNG and PDF.

  • Designed for zonal mean visualizations in Lev-Lat coordinates.

  • Default y-axis (vertical levels) is set to descend from 5500 m to 0 m.
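Several methods in this reference accept save_format as either a single string or a list of strings. A minimal sketch of how such an argument can be normalised before looping over the formats is given here. The helper normalize_formats is hypothetical and written only for illustration; it is not part of AQUA.

```python
# Hypothetical helper, NOT part of AQUA: normalise a save_format argument.
def normalize_formats(save_format):
    """Return a list of unique, lowercase format names from a str or a list of str."""
    if isinstance(save_format, str):
        save_format = [save_format]
    formats = []
    for fmt in save_format:
        fmt = fmt.lower().lstrip(".")  # accept "PNG" and ".png" alike
        if fmt not in formats:         # drop duplicates, keep caller's order
            formats.append(fmt)
    return formats


print(normalize_formats("png"))
print(normalize_formats(["PNG", "pdf", ".png"]))
```

Accepting both forms at the boundary keeps the rest of a saving routine simple, since it can always iterate over a list.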

class aqua.diagnostics.PlotGlobalBiases(diagnostic='globalbiases', save_format=['png', 'pdf', 'svg'], dpi=300, outputdir='./', cmap='RdBu_r', return_fig: bool = False, loglevel='WARNING')

Bases: object

Initialize the PlotGlobalBiases class.

Parameters:
  • diagnostic (str) – Name of the diagnostic.

  • save_format (str or list) – Format(s) to save the figures. Default is SAVE_FORMAT.

  • dpi (int) – Resolution of saved figures.

  • outputdir (str) – Output directory for saved plots.

  • cmap (str) – Colormap to use for the plots.

  • return_fig (bool) – Whether plotting methods should return the figure and axes.

  • loglevel (str) – Logging level.

plot_bias(data, data_ref, var, plev=None, proj='robinson', proj_params={}, vmin=None, vmax=None, cbar_label=None, area=None, show_stats=False, data_timeseries=None, data_ref_timeseries=None, show_significance=False, significance_alpha=0.05, stipple_density=3, stipple_size=0.5, invert_stippling=False)

Plots the bias map between two datasets.

Parameters:
  • data (xarray.Dataset) – Primary dataset.

  • data_ref (xarray.Dataset) – Reference dataset.

  • var (str) – Variable name.

  • plev (float, optional) – Pressure level.

  • proj (str, optional) – Desired projection for the map.

  • proj_params (dict, optional) – Additional arguments for the projection.

  • vmin (float, optional) – Minimum colorbar value.

  • vmax (float, optional) – Maximum colorbar value.

  • cbar_label (str, optional) – Label for the colorbar.

  • area (xr.DataArray, optional) – Grid cell areas for computing weighted statistics.

  • show_stats (bool, optional) – Whether to show statistical information on the plot.
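To illustrate what an area-weighted statistic looks like, the sketch below computes a weighted mean bias from plain Python lists. It assumes flattened inputs of equal length, and weighted_mean_bias is a hypothetical name. The statistics actually displayed by show_stats are not specified in this reference and may differ.

```python
# Illustrative only: area-weighted mean of (data - data_ref) over grid cells.
def weighted_mean_bias(data, data_ref, area):
    """Return sum((data - data_ref) * area) / sum(area) for flat sequences."""
    if not (len(data) == len(data_ref) == len(area)):
        raise ValueError("data, data_ref and area must have the same length")
    total_area = sum(area)
    if total_area == 0:
        raise ValueError("total area must be non-zero")
    weighted_sum = sum((d - r) * a for d, r, a in zip(data, data_ref, area))
    return weighted_sum / total_area


# Two cells: biases of 1.0 and 2.0, the second cell three times larger.
bias = weighted_mean_bias([1.0, 3.0], [0.0, 1.0], [1.0, 3.0])
print(bias)
```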

plot_climatology(data, var, plev=None, proj='robinson', proj_params={}, vmin=None, vmax=None, cbar_label=None)

Plots the climatology map for a given variable and time range.

Parameters:
  • data (xarray.Dataset) – Climatology dataset to plot.

  • var (str) – Variable name.

  • plev (float, optional) – Pressure level to plot (if applicable).

  • proj (string, optional) – Desired projection for the map.

  • proj_params (dict, optional) – Additional arguments for the projection (e.g., {‘central_longitude’: 0}).

  • vmin (float, optional) – Minimum color scale value.

  • vmax (float, optional) – Maximum color scale value.

  • cbar_label (str, optional) – Label for the colorbar.

Returns:

Matplotlib figure and axis objects.

Return type:

tuple

plot_seasonal_bias(data, data_ref, var, plev=None, proj='robinson', proj_params={}, vmin=None, vmax=None, cbar_label=None)

Plots seasonal biases for each season (DJF, MAM, JJA, SON).

Parameters:
  • data (xarray.Dataset) – Primary dataset.

  • data_ref (xarray.Dataset) – Reference dataset.

  • var (str) – Variable name.

  • plev (float, optional) – Pressure level.

  • proj (str, optional) – Desired projection for the map.

  • proj_params (dict, optional) – Additional arguments for the projection.

  • vmin (float, optional) – Minimum colorbar value.

  • vmax (float, optional) – Maximum colorbar value.

  • cbar_label (str, optional) – Label for the colorbar.

Returns:

The resulting figure.

Return type:

matplotlib.figure.Figure
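The four seasons named above follow the usual meteorological grouping of months. The sketch below shows that grouping on a plain list of twelve monthly values. The names season_of and seasonal_means are illustrative and are not AQUA functions; the library itself operates on xarray objects.

```python
# Meteorological seasons as tuples of calendar months (1 = January).
SEASONS = {
    "DJF": (12, 1, 2),
    "MAM": (3, 4, 5),
    "JJA": (6, 7, 8),
    "SON": (9, 10, 11),
}


def season_of(month):
    """Return the season label for a calendar month in 1..12."""
    for name, months in SEASONS.items():
        if month in months:
            return name
    raise ValueError(f"month must be in 1..12, got {month}")


def seasonal_means(monthly_values):
    """Average twelve monthly values (January first) into the four seasons."""
    if len(monthly_values) != 12:
        raise ValueError("expected 12 monthly values")
    return {
        name: sum(monthly_values[m - 1] for m in months) / len(months)
        for name, months in SEASONS.items()
    }


means = seasonal_means(list(range(1, 13)))
print(means)
```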

plot_vertical_bias(data, data_ref, var, plev_min=None, plev_max=None, vmin=None, vmax=None, vmin_contour=None, vmax_contour=None, nlevels=18)

Calculates and plots the vertical bias between two datasets.

Parameters:
  • data (xarray.Dataset) – Dataset to analyze.

  • data_ref (xarray.Dataset) – Reference dataset for comparison.

  • var (str) – Variable name to analyze.

  • plev_min (float, optional) – Minimum pressure level.

  • plev_max (float, optional) – Maximum pressure level.

  • vmin (float, optional) – Minimum colorbar value.

  • vmax (float, optional) – Maximum colorbar value.

  • vmin_contour (float, optional) – Minimum contour value.

  • vmax_contour (float, optional) – Maximum contour value.

  • nlevels (int, optional) – Number of contour levels for the plot.

class aqua.diagnostics.PlotGregory(diagnostic_name: str = 'gregory', t2m_monthly_data=None, net_toa_monthly_data=None, t2m_annual_data=None, net_toa_annual_data=None, t2m_monthly_ref=None, net_toa_monthly_ref=None, t2m_annual_ref=None, net_toa_annual_ref=None, t2m_annual_std=None, net_toa_annual_std=None, loglevel: str = 'WARNING')

Bases: PlotBaseMixin

Initialize the class with the data to be plotted

Parameters:
  • t2m_monthly_data – List of monthly 2m temperature data

  • net_toa_monthly_data – List of monthly net toa data

  • t2m_annual_data – List of annual 2m temperature data

  • net_toa_annual_data – List of annual net toa data

  • t2m_monthly_ref – Monthly reference 2m temperature data

  • net_toa_monthly_ref – Monthly reference net toa data

  • t2m_annual_ref – Annual reference 2m temperature data

  • net_toa_annual_ref – Annual reference net toa data

  • t2m_annual_std – Annual standard deviation of 2m temperature data

  • net_toa_annual_std – Annual standard deviation of net toa data

  • loglevel – Logging level. Default is ‘WARNING’

get_data_info()

Extract the information needed for labels, descriptions, etc. from the data array attributes.

The attributes are:

  • AQUA_catalog

  • AQUA_model

  • AQUA_exp
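A minimal sketch of this kind of metadata extraction follows, using plain dictionaries as stand-ins for the attrs of each data array. The function collect_info and the sample values are assumptions made for illustration; how get_data_info stores the result internally is not described in this reference.

```python
# Stand-ins for DataArray.attrs; the values are invented for illustration.
attrs_list = [
    {"AQUA_catalog": "cat-a", "AQUA_model": "model-a", "AQUA_exp": "exp-1"},
    {"AQUA_model": "model-b", "AQUA_exp": "exp-2"},  # catalog missing on purpose
]


def collect_info(attrs_list):
    """Gather the AQUA_* attributes of every array; missing ones become None."""
    keys = ("AQUA_catalog", "AQUA_model", "AQUA_exp")
    return {key: [attrs.get(key) for attrs in attrs_list] for key in keys}


info = collect_info(attrs_list)
print(info)
```

Using dict.get keeps the extraction tolerant of arrays that lack one of the attributes.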

plot(freq=['monthly', 'annual'], title: str = None, data_labels: list = None, ref_label: str = None, style: str = 'aqua')

Plot the data

Parameters:
  • freq – List of frequency for plotting. Default is [‘monthly’, ‘annual’]

  • title – Title of the plot. Default is None

  • data_labels – List of labels for the data. Default is None

  • ref_label – Label for the reference data. Default is None

  • style – Style of the plot. Default is ‘aqua’

plot_annual(fig: Figure, ax: Axes, data_labels: list = None)

Plot the annual data

Parameters:
  • fig – Figure object

  • ax – Axes object

  • data_labels – List of labels for the data. Default is None

Returns:

The Figure object (fig) and the Axes object (ax).

Return type:

tuple

plot_monthly(fig: Figure, ax: Axes, data_labels: list = None, ref_label: str = None)

Plot the monthly data

Parameters:
  • fig – Figure object

  • ax – Axes object

  • data_labels – List of labels for the data. Default is None

  • ref_label – Label for the reference data. Default is None

Returns:

The Figure object (fig) and the Axes object (ax).

Return type:

tuple

set_description()

Set the description for the plot

set_ref_label()

Set the reference label for the plot

set_title()

Set the title for the plot

class aqua.diagnostics.PlotHistogram(data=None, ref_data=None, diagnostic_name='histogram', density=True, loglevel: str = 'WARNING')

Bases: object

Class for plotting Histogram diagnostics. Provides methods to plot histogram/PDF data with customizable labels, titles, and styling options.

Initialize the PlotHistogram class.

Parameters:
  • data – List of histogram DataArrays to plot, or single DataArray.

  • ref_data – Reference histogram DataArray.

  • diagnostic_name (str) – Name of the diagnostic. Default is ‘histogram’.

  • density (bool) – Whether data represents PDF (True) or counts (False).

  • loglevel (str) – Logging level. Default is ‘WARNING’.
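The density flag distinguishes a probability density function from raw counts. For reference, the usual relation between the two is pdf_i = count_i / (N * width_i), where N is the total count and width_i is the width of bin i. The sketch below applies this relation to plain lists. The helper counts_to_density is illustrative and makes no claim about how the Histogram diagnostic computes its output.

```python
# Illustrative conversion from histogram counts to a probability density.
def counts_to_density(counts, edges):
    """Return count_i / (N * width_i) for each bin, given len(counts) + 1 edges."""
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have exactly one more element than counts")
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalise an empty histogram")
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    return [c / (total * w) for c, w in zip(counts, widths)]


edges = [0.0, 1.0, 3.0]
pdf = counts_to_density([1, 3], edges)
# The density integrates to one over the bins.
integral = sum(p * (hi - lo) for p, lo, hi in zip(pdf, edges[:-1], edges[1:]))
print(pdf, integral)
```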

get_data_info()

Extract metadata from data arrays.

plot(data_labels=None, ref_label=None, title=None, style=None, xlogscale=False, ylogscale=True, xmax=None, xmin=None, ymax=None, ymin=None, smooth=False, smooth_window=5, labelsize=None)

Plot histogram data.

Parameters:
  • data_labels (list, optional) – Labels for the data.

  • ref_label (str, optional) – Label for the reference data.

  • title (str, optional) – Title for the plot.

  • style (str, optional) – Plotting style.

  • xlogscale (bool) – Use log scale for x-axis.

  • ylogscale (bool) – Use log scale for y-axis.

  • xmax (float, optional) – Maximum x value.

  • xmin (float, optional) – Minimum x value.

  • ymax (float, optional) – Maximum y value.

  • ymin (float, optional) – Minimum y value.

  • smooth (bool) – Apply smoothing to data.

  • smooth_window (int) – Window size for smoothing.

  • labelsize (int, optional) – Font size for labels.

Returns:

Matplotlib figure and axes objects.

Return type:

tuple
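This reference does not state which smoothing is applied when smooth is True. One common choice, shown here purely as an assumption, is a centred moving average whose width plays the role of smooth_window. The function moving_average is a hypothetical stand-in, not the AQUA implementation.

```python
# Assumed smoothing scheme for illustration: centred moving average.
def moving_average(values, window=5):
    """Average each point with its neighbours; the window shrinks at the edges."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)                # clip the window at the left edge
        hi = min(len(values), i + half + 1)  # clip the window at the right edge
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


print(moving_average([0, 0, 3, 0, 0], window=3))
```

A larger window gives a smoother curve but flattens narrow peaks, which matters when the y-axis is logarithmic.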

run(outputdir='./', rebuild=True, dpi=300, style=None, format: str | list = ['png', 'pdf', 'svg'], xlogscale=False, ylogscale=True, xmax=None, xmin=None, ymax=None, ymin=None, smooth=False, smooth_window=5, labelsize=None, show=False)

Run the complete plotting workflow.

Parameters:
  • outputdir (str) – Output directory to save the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • dpi (int) – Dots per inch for the plot.

  • style (str) – Plotting style.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • xlogscale (bool) – Use log scale for x-axis.

  • ylogscale (bool) – Use log scale for y-axis.

  • xmax (float, optional) – Maximum x value.

  • xmin (float, optional) – Minimum x value.

  • ymax (float, optional) – Maximum y value.

  • ymin (float, optional) – Minimum y value.

  • smooth (bool) – Apply smoothing to data.

  • smooth_window (int) – Window size for smoothing.

  • show (bool) – If True, display the plot interactively.

save_plot(fig, description: str = None, rebuild: bool = True, outputdir: str = './', dpi: int = 300, format: str | list = ['png', 'pdf', 'svg'])

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – Figure object.

  • description (str) – Description of the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • outputdir (str) – Output directory to save the plot.

  • dpi (int) – Dots per inch for the plot.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

set_data_labels()

Set the data labels for the plot.

set_description()

Set the description for the plot.

set_ref_label()

Set the reference label for the plot.

set_title()

Set the title for the plot.

class aqua.diagnostics.PlotHovmoller(data: list[Dataset], diagnostic_name: str = 'oceandrift', vert_coord: str = 'level', outputdir: str = '.', loglevel: str = 'WARNING')

Bases: object

Class for plotting Hovmoller diagrams and timeseries from AQUA ocean drift diagnostics.

This class provides methods to generate, customize, and save Hovmoller and timeseries plots using xarray datasets and AQUA conventions. It handles metadata extraction, plot styling, and output file management.

Initialize the PlotHovmoller class.

Parameters:
  • data (list[xr.Dataset]) – List of xarray datasets containing the data to be plotted

  • diagnostic_name (str) – Name of the diagnostic, default is “oceandrift”

  • vert_coord (str) – Name of the vertical dimension coordinate, default is “level”

  • outputdir (str) – Directory where the output will be saved, default is current directory

  • loglevel (str) – Logging level, default is “WARNING”

plot_hovmoller(rebuild: bool = True, save_format: list = ['png', 'pdf', 'svg'], dpi: int = 300)

Plot the Hovmoller diagram for the given data.

This method sets the title, description, vmax, vmin, and texts for the plot. It then calls the plot_multi_hovmoller function to create the plot and saves it using the OutputSaver.

Parameters:
  • rebuild (bool) – Whether to rebuild the output, default is True.

  • save_format (str or list) – List of formats to save the figure. Default is SAVE_FORMAT.

  • dpi (int) – Dots per inch for the saved figure. Default is 300.

Returns:

None

plot_timeseries(levels: list = None, rebuild: bool = True, save_format: list = ['png', 'pdf', 'svg'], dpi: int = 300)

Plot the timeseries for the given data.

This method sets the title, description, vmax, vmin, and texts for the plot. It then calls the plot_multi_timeseries function to create the plot and saves it using the OutputSaver.

Parameters:
  • levels (list, optional) – List of levels to plot. Default is None.

  • rebuild (bool) – Whether to rebuild the output, default is True.

  • save_format (list) – List of formats to save the figure. Default is SAVE_FORMAT.

  • dpi (int) – Dots per inch for the saved figure. Default is 300.

Returns:

None

save_plot(fig, diagnostic_product: str = None, extra_keys: dict = None, metadata: dict = None, rebuild: bool = True, dpi: int = 300, format: str = ['png', 'pdf', 'svg'])

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure to be saved.

  • diagnostic_product (str) – The name of the diagnostic product. Default is None.

  • extra_keys (dict) – Extra keys to be used for the filename (e.g. season). Default is None.

  • rebuild (bool) – If True, the output files will be rebuilt. Default is True.

  • dpi (int) – The dpi of the figure. Default is 300.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • metadata (dict) – Metadata to attach to the figure. Default is None. It is complemented with the metadata from the OutputSaver. This is typically where the description of the figure is added.

set_data_for_levels()

Extract and set the data at the specified depth levels.

set_data_type()

Set the data type list for each dataset based on its AQUA attributes.

set_description(content: str = None)

Set the description for the Hovmoller plot.

set_levels()

Set the levels and corresponding labels for timeseries plots.

set_line_plot_colours()

Set the color list for line plots based on the number of levels.

set_suptitle(content: str = None)

Set the suptitle for the Hovmoller plot.

set_texts()

Set the text annotations for each subplot panel based on drift type.

set_title()

Set the title for each subplot panel.

set_vmax_vmin()

Set the vmax, vmin, and colormap for each subplot from the data type.

class aqua.diagnostics.PlotLatLonProfiles(data=None, ref_data=None, data_type='longterm', ref_std_data=None, diagnostic_name='lat_lon_profiles', loglevel: str = 'WARNING')

Bases: object

Class for plotting Lat-Lon Profiles diagnostics. This class provides methods to set data labels, description, title, and to plot the data. It handles data arrays regardless of their original temporal frequency, as temporal averaging is handled upstream.

Initialise the PlotLatLonProfiles class. This class is used to plot lat lon profiles data previously processed by the LatLonProfiles class.

Parameters:
  • data – Either a list of temporally-averaged data arrays for annual plots, or a list of seasonal data [DJF, MAM, JJA, SON] for seasonal plots

  • ref_data – Reference data (structure matches data based on data_type)

  • data_type (str) – ‘longterm’ for single/multi-line longterm plots, ‘seasonal’ for 4-panel seasonal plots

  • ref_std_data – Reference standard deviation data

  • diagnostic_name (str) – Name of the diagnostic. Default is ‘lat_lon_profiles’.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

Note

data_type determines how ‘data’ is interpreted:

  • ‘longterm’: data should be a list of DataArrays for a single plot

  • ‘seasonal’: data should be [DJF, MAM, JJA, SON] for 4-panel seasonal plots
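The two interpretations of data can be sketched as a small dispatch on data_type. The function label_profile_data and the "line N" labels are hypothetical and serve only to show the expected input shapes; the items would be xarray DataArrays in practice.

```python
SEASON_NAMES = ("DJF", "MAM", "JJA", "SON")


def label_profile_data(data, data_type="longterm"):
    """Pair each item of data with a line or panel label, following data_type."""
    if data_type == "longterm":
        # Any number of arrays, all drawn as lines on a single plot.
        return [(f"line {i}", item) for i, item in enumerate(data)]
    if data_type == "seasonal":
        # Exactly four entries, one per season panel, in a fixed order.
        if len(data) != len(SEASON_NAMES):
            raise ValueError("seasonal data must be [DJF, MAM, JJA, SON]")
        return list(zip(SEASON_NAMES, data))
    raise ValueError(f"unknown data_type: {data_type!r}")


print(label_profile_data(["a", "b"], "longterm"))
print(label_profile_data(["w", "x", "y", "z"], "seasonal"))
```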

get_data_info()

Extract metadata from data arrays based on data_type.

plot(data_labels=None, ref_label=None, title=None, style=None)

Unified plotting method that handles all plotting scenarios based on data_type.

Parameters:
  • data_labels (list, optional) – Labels for the data.

  • ref_label (str, optional) – Label for the reference data.

  • title (str, optional) – Title for the plot.

  • style (str, optional) – Plotting style. Default is the AQUA style.

Returns:

Matplotlib figure and axes objects.

Return type:

tuple

plot_seasonal_lines(data_labels=None, title=None, style=None)

Plot seasonal means using plot_seasonal_lat_lon_profiles. Creates a 4-panel plot with DJF, MAM, JJA, SON only.

Parameters:
  • data_labels (list) – List of data labels.

  • title (str) – Title of the plot.

  • style (str) – Plotting style. Default is the AQUA style

Returns:

The figure (matplotlib.figure.Figure) and the list of axes objects (axs).

Return type:

tuple

run(outputdir='./', rebuild=True, dpi=300, style=None, format=['png', 'pdf', 'svg'], show=False)

Unified run method that handles all plotting scenarios.

Parameters:
  • outputdir (str) – Output directory to save the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • dpi (int) – Dots per inch for the plot.

  • style (str) – Plotting style. Default is the AQUA style.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • show (bool) – If True, display the plot interactively.

save_plot(fig, description: str = None, rebuild: bool = True, outputdir: str = './', dpi: int = 300, format: str | list = ['png', 'pdf', 'svg'], diagnostic: str = None)

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – Figure object.

  • description (str) – Description of the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • outputdir (str) – Output directory to save the plot.

  • dpi (int) – Dots per inch for the plot.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • diagnostic (str) – Diagnostic name to be used in the filename as diagnostic_product.

set_data_labels()

Set the data labels for the plot based on data_type.

set_description()

Set the caption for the plot.

set_ref_label()

Set the reference label for the plot. The label is extracted from the reference data array attributes.

Returns:

Reference label for the plot.

Return type:

ref_label (str)

set_title()

Set the title for the plot. Specialized for Lat-Lon Profiles diagnostic.

Returns:

Title for the plot.

Return type:

title (str)

class aqua.diagnostics.PlotMJO(data, outputdir: str = './', loglevel: str = 'WARNING')

Bases: object

PlotMJO class for plotting the MJO Hovmoller data. This class is a placeholder for future plotting methods.

Initialize the PlotMJO class.

Parameters:
  • data (xarray.DataArray) – Data to be plotted.

  • outputdir (str) – Directory where the plots will be saved. Default is ‘./’.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

plot_hovmoller(invert_axis: bool = True, invert_time: bool = True, nlevels: int = 21, cmap: str = 'PuOr', vmin: float = -90, vmax: float = 90)

Plot the Hovmoller diagram for the MJO data.

Parameters:
  • invert_axis (bool) – If True, invert the axis. Default is True.

  • invert_time (bool) – If True, invert the time axis. Default is True.

  • nlevels (int) – Number of contour levels. Default is 21.

  • cmap (str) – Colormap to use for the plot. Default is ‘PuOr’.

  • vmin (float) – Minimum value for the colorbar. Default is -90.

  • vmax (float) – Maximum value for the colorbar. Default is 90.

Returns:

The Hovmoller plot figure.

Return type:

fig (matplotlib.figure.Figure)

save_plot(fig, diagnostic_product: str = 'hovmoller', extra_keys: dict = None, rebuild: bool = True, metadata: dict = None, format: str | list = ['png', 'pdf', 'svg'], dpi: int = 300)

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure to be saved.

  • diagnostic_product (str) – The name of the diagnostic product. Default is ‘hovmoller’.

  • extra_keys (dict) – Extra keys to be used for the filename (e.g. season). Default is None.

  • rebuild (bool) – If True, the output files will be rebuilt. Default is True.

  • dpi (int) – The dpi of the figure. Default is 300.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • metadata (dict) – Metadata to attach to the figure. Default is None. It is complemented with the metadata from the OutputSaver. This is typically where the description of the figure is added.

class aqua.diagnostics.PlotMLD(data: Dataset, obs: Dataset = None, diagnostic_name: str = 'ocean_stratification', outputdir: str = '.', loglevel: str = 'WARNING')

Bases: object

Class for plotting Mixed Layer Depth (MLD) maps.

Initialize the PlotMLD class.

Parameters:
  • data (xr.Dataset) – Dataset containing the MLD data to be plotted.

  • obs (xr.Dataset, optional) – Dataset containing observational MLD data for comparison. Default is None.

  • clim_time (str, optional) – Climatological time period for the data. Default is “January”.

  • diagnostic_name (str, optional) – Name of the diagnostic. Default is “ocean_stratification”.

  • outputdir (str, optional) – Directory to save the output plots. Default is the current directory.

  • loglevel (str, optional) – Logging level. Default is “WARNING”.

plot_mld(rebuild: bool = True, region: str = None, proj_name: str = 'PlateCarree', extent: list = None, save_format: str | list = ['png', 'pdf', 'svg'], dpi: int = 300)

Generate and save the MLD map plot.

Parameters:
  • rebuild (bool, optional) – If True, rebuild existing output files. Default is True.

  • region (str, optional) – Region name to override the dataset’s default. Default is None.

  • proj_name (str, optional) – Cartopy projection name. Default is “PlateCarree”.

  • extent (list, optional) – Map extent as [lonmin, lonmax, latmin, latmax]. Default is None.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • dpi (int, optional) – Resolution of the saved figure. Default is 300.

save_plot(fig, diagnostic_product: str = None, extra_keys: dict = None, rebuild: bool = True, dpi: int = 300, format: str = ['png', 'pdf', 'svg'], metadata: dict = None)

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure to be saved.

  • diagnostic_product (str) – The name of the diagnostic product. Default is None.

  • extra_keys (dict) – Extra keys to be used for the filename (e.g. season). Default is None.

  • rebuild (bool) – If True, the output files will be rebuilt. Default is True.

  • dpi (int) – The dpi of the figure. Default is 300.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • metadata (dict) – Metadata to attach to the figure. Default is None. It is complemented with the metadata from the OutputSaver. This is typically where the description of the figure is added.

set_cbar_labels(var: str = None)

Set the colorbar label for the given variable.

Parameters:

var (str, optional) – Variable name to derive the colorbar label from.

set_cbar_limits()

Set the colorbar limits and number of levels for MLD plots.

set_central_lat_lon()

Set the central latitude and longitude for the map projection.

set_convert_lon(data=None)

Convert longitude from 0-360 to -180 to 180 and sort accordingly.
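The conversion itself is a standard wrap-around followed by a sort. A minimal sketch on a plain list is shown below; convert_lon is an illustrative function, and the actual method presumably works on the longitude coordinate of an xarray object. Note that with this formula a longitude of exactly 180 maps to -180.

```python
# Illustrative wrap of longitudes from [0, 360) to [-180, 180), then sort.
def convert_lon(lons):
    """Return (sorted converted longitudes, index order used for the sort)."""
    converted = [((lon + 180.0) % 360.0) - 180.0 for lon in lons]
    # The same order must be applied to the data along the longitude axis.
    order = sorted(range(len(converted)), key=converted.__getitem__)
    return [converted[i] for i in order], order


new_lons, order = convert_lon([0, 90, 180, 270])
print(new_lons, order)
```

Returning the sort order alongside the new coordinate makes it possible to reorder the data values consistently.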

set_data_map_list()

Build the list of DataArrays to be plotted as maps.

set_description()

Build the figure description string including model and observation date ranges.

set_extent()

Set the map extent based on the data’s coordinate range and region.

set_figsize()

Set the figure size based on the number of rows and columns.

set_nrowcol()

Set the number of rows and columns for the subplot grid.

set_proj(proj_name: str = 'PlateCarree')

Set the Cartopy map projection.

Parameters:

proj_name (str, optional) – Projection name (‘PlateCarree’ or ‘Orthographic’). Default is ‘PlateCarree’.

set_suptitle(plot_type=None)

Set the title for the MLD plot.

set_title()

Set the title for each subplot panel.

set_ytext()

Set the y-axis text labels for each subplot.

class aqua.diagnostics.PlotNAO(indexes=None, ref_indexes=None, outputdir: str = './', rebuild: bool = True, loglevel: str = 'WARNING')

Bases: PlotBaseMixin

Plot the NAO products.

Parameters:
  • indexes (list) – List of indexes to plot.

  • ref_indexes (list) – List of reference indexes to plot.

  • outputdir (str) – Directory to save the plots. Default is ‘./’.

  • rebuild (bool) – If True, rebuild the plots. Default is True.

  • loglevel (str) – Log level for the logger. Default is ‘WARNING’.

plot_index(thresh: float = 0.0)

plot_maps(maps=None, ref_maps=None, statistic: str = None, vmin: float = None, vmax: float = None, vmin_diff: float = None, vmax_diff: float = None, **kwargs)

Plot the maps for the NAO products.

Parameters:
  • maps (list) – List of maps to plot.

  • ref_maps (list) – List of reference maps to plot.

  • statistic (str) – Statistic to plot. Default is None.

  • vmin (float) – Minimum value for the colorbar. Default is None.

  • vmax (float) – Maximum value for the colorbar. Default is None.

  • vmin_diff (float) – Minimum value for the colorbar of the difference. Default is None.

  • vmax_diff (float) – Maximum value for the colorbar of the difference. Default is None.

  • **kwargs – Additional arguments for the plotting function.

Returns:

Figure object.

Return type:

fig

set_index_description()

Build the description for the NAO index plot. Adds information about the months window used to compute the index.

set_map_description(maps=None, ref_maps=None, statistic: str = None)

Set the description for the maps.

Parameters:
  • maps (list) – List of maps to plot.

  • ref_maps (list) – List of reference maps to plot.

  • statistic (str) – Statistic to plot. Default is None.

Returns:

Description of the maps.

Return type:

str

class aqua.diagnostics.PlotSeaIce(monthly_models=None, annual_models=None, monthly_ref=None, annual_ref=None, monthly_std_ref: str = None, annual_std_ref: str = None, model: str = None, exp: str = None, source: str = None, catalog: str = None, regions_to_plot: list = ['Arctic', 'Antarctic'], outputdir='./', rebuild=True, filename_keys=None, dpi=300, loglevel='WARNING')

Bases: object

A class for processing and visualizing timeseries of integrated sea ice extent or volume. It is designed to work with AQUA-computed outputs (from the SeaIce diagnostic) repacking them into a unified format for easy comparison, labeling, and plotting.

Parameters:
  • monthly_models (xr.Dataset | list[xr.Dataset] | None, optional) – Monthly model datasets to be processed. Defaults to None.

  • annual_models (xr.Dataset | list[xr.Dataset] | None, optional) – Annual model datasets to be processed. Defaults to None.

  • monthly_ref (xr.Dataset | list[xr.Dataset] | None, optional) – Monthly reference datasets for comparison. Defaults to None.

  • annual_ref (xr.Dataset | list[xr.Dataset] | None, optional) – Annual reference datasets for comparison. Defaults to None.

  • monthly_std_ref (str, optional) – Monthly standard deviation reference dataset identifier. Defaults to None.

  • annual_std_ref (str, optional) – Annual standard deviation reference dataset identifier. Defaults to None.

  • model (str, optional) – Name of the model associated with the dataset. Defaults to None.

  • exp (str, optional) – Experiment name related to the dataset. Defaults to None.

  • source (str, optional) – Source of the dataset. Defaults to None.

  • catalog (str, optional) – Catalog name of the dataset. Defaults to None.

  • regions_to_plot (list, optional) – List of region names to be plotted (e.g., [‘arctic’, ‘antarctic’]). If None, all available regions are plotted. Defaults to [‘Arctic’, ‘Antarctic’].

  • outputdir (str, optional) – Directory to save output plots. Defaults to ‘./’.

  • rebuild (bool, optional) – Whether to rebuild (overwrite) figure outputs if they already exist. Defaults to True.

  • filename_keys (list, optional) – List of keys to include in the output filenames. If None, all keys are included. Defaults to None.

  • dpi (int, optional) – Resolution of saved figures (dots per inch). Defaults to 300.

  • loglevel (str, optional) – Logging level for debugging and information messages. Defaults to ‘WARNING’.

plot_seaice(plot_type: str = 'timeseries', save_format: str | list = ['png', 'pdf', 'svg'], style: str = None, **kwargs)

Plot sea ice data for each region, either as timeseries or seasonal cycle.

Parameters:
  • plot_type (str, optional) – Type of plot to generate. Options are ‘timeseries’ or ‘seasonalcycle’. Default is ‘timeseries’.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • style (str, optional) – Override the plotting style. Defaults to None, in which case the style is taken from the config file, falling back to ‘aqua’.

  • **kwargs – Additional keyword arguments passed to the region-specific plotting function.

regions_type_plotter(region_dict, style, **kwargs)

Loops over each region in region_dict and plots data either as a timeseries or a seasonal cycle depending on plot_type attribute.

Parameters:
  • region_dict (dict) – Dictionary of regions and their associated data.

  • style (str) – Graphic style of the plot.

  • **kwargs (dict) – Additional keyword arguments passed on to the underlying plotting function.

Returns:

The figure and axes objects (fig, axes).

Return type:

tuple

repack_datasetlists(**kwargs) dict

Repack input datasets into a nested dictionary organized by method and region.

The output dictionary is structured as:

{ method: { region: { str_data: [list of data arrays] }}}

where:

  • ‘method’ is extracted from the dataset attributes (defaulting to “Unknown”).

  • ‘region’ is determined by self._get_region(dataset, data_var).

  • ‘str_data’ is the keyword with the data in input, and each value is a list of data arrays corresponding to that keyword.

Parameters:

**kwargs (dict) – Keyword arguments, where each str_data is linked to the kwargs in plot_timeseries() and each value is a list of xr.Dataset objects.

Returns:

A nested dict containing the repacked data arrays.

Return type:

dict
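The nested structure described above can be sketched with plain dictionaries. In this illustration each dataset is a dict with hypothetical "method", "region" and "values" keys; in the real class the method comes from the dataset attributes and the region from self._get_region(dataset, data_var). The function repack and the sample inputs are assumptions, not the AQUA implementation.

```python
# Illustrative repacking into {method: {region: {str_data: [arrays]}}}.
def repack(**kwargs):
    """Group the values of every input dataset by method, region and keyword."""
    repacked = {}
    for str_data, datasets in kwargs.items():
        for ds in datasets or []:  # a keyword may be passed as None
            method = ds.get("method", "Unknown")
            region = ds.get("region", "Unknown")
            by_region = repacked.setdefault(method, {}).setdefault(region, {})
            by_region.setdefault(str_data, []).append(ds["values"])
    return repacked


result = repack(
    monthly_models=[
        {"method": "extent", "region": "Arctic", "values": [1, 2]},
        {"method": "extent", "region": "Antarctic", "values": [3, 4]},
    ],
    monthly_ref=[{"region": "Arctic", "values": [5, 6]}],  # no method attribute
    annual_ref=None,
)
print(result)
```

Keying first by method and then by region means a plotting loop can draw one figure per method and one panel per region without further filtering.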

save_fig(fig, save_format: str | list = ['png', 'pdf', 'svg'], metadata: dict = None, region_dict: dict = None)

Save a matplotlib figure in the specified format(s) with associated metadata.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure object to be saved.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT.

  • metadata (dict, optional) – Metadata such as description to be saved. Defaults to None.

  • region_dict (dict, optional) – Dictionary of regions plotted. Used to generate output filename. Defaults to None.

class aqua.diagnostics.PlotSeasonalCycles(diagnostic_name: str = 'seasonalcycles', monthly_data=None, ref_monthly_data=None, std_monthly_data=None, loglevel: str = 'WARNING')

Bases: PlotBaseMixin

Initialize the PlotSeasonalCycles class. This class is used to plot seasonal cycles data previously processed by the SeasonalCycles class.

Parameters:
  • diagnostic_name (str) – The name of the diagnostic. Used for logger and filenames. Default is ‘seasonalcycles’.

  • monthly_data (list) – List of monthly data arrays.

  • ref_monthly_data (xr.DataArray) – Reference monthly data array.

  • std_monthly_data (xr.DataArray) – Standard deviation monthly data array.

  • loglevel (str) – Logging level. Default is ‘WARNING’.

get_data_info()

Extract the information needed for labels, descriptions, etc. from the data array attributes.

The attributes are:

  • AQUA_catalog

  • AQUA_model

  • AQUA_exp

  • AQUA_startdate

  • AQUA_enddate

  • AQUA_std_startdate

  • AQUA_std_enddate

  • AQUA_region

  • short_name

  • long_name

  • units

plot_seasonalcycles(data_labels=None, ref_label=None, title=None)

Plot the seasonal cycle using the plot_seasonalcycle function.

Parameters:
  • data_labels (list) – List of data labels.

  • ref_label (str) – Reference label.

  • title (str) – Title of the plot.

Returns:

The figure (matplotlib.figure.Figure) and the axes (matplotlib.axes.Axes) objects.

Return type:

tuple

run(outputdir: str = './', rebuild: bool = True, dpi: int = 300, format: str | list = ['png', 'pdf', 'svg'])

Run the PlotSeasonalCycles class.

Parameters:
  • outputdir (str) – Output directory to save the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • dpi (int) – Dots per inch for the plot.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT.

save_plot(fig, description: str | None = None, rebuild: bool = True, outputdir: str = './', dpi: int = 300, format: str | list = ['png', 'pdf', 'svg'])

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – Figure object.

  • description (str, optional) – Description of the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • outputdir (str) – Output directory to save the plot.

  • dpi (int) – Dots per inch for the plot.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).
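Because format accepts either a single string or a list, a caller-side sketch of how such an argument can be normalised before looping over output files may be useful. The helpers normalise_formats and output_names are hypothetical, and the filename pattern is invented for illustration; the real filenames are built by the library.

```python
def normalise_formats(fmt):
    """Hypothetical helper: always return a list of format strings."""
    if isinstance(fmt, str):
        return [fmt]
    return list(fmt)


def output_names(stem: str, fmt) -> list:
    """Build one illustrative filename per requested format."""
    return [f"{stem}.{ext}" for ext in normalise_formats(fmt)]


names_single = output_names("seasonalcycles", "png")
names_multi = output_names("seasonalcycles", ["png", "pdf", "svg"])
```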

set_description()

Set the caption for the plot. The caption is extracted from the data arrays attributes and the reference data arrays attributes. The caption is stored as ‘Description’ in the metadata dictionary.

Returns:

Caption for the plot.

Return type:

description (str)

set_title()

Set the title for the plot.

Returns:

Title for the plot.

Return type:

title (str)

class aqua.diagnostics.PlotStratification(data: Dataset, obs: Dataset = None, diagnostic_name: str = 'ocean_stratification', vert_coord: str = 'level', outputdir: str = '.', loglevel: str = 'WARNING')

Bases: object

Class for plotting ocean stratification vertical profiles.

Initialize PlotStratification with model and observational datasets.

Parameters:
  • data (xr.Dataset) – Dataset containing stratification variables to plot.

  • obs (xr.Dataset, optional) – Observational dataset for comparison. Default is None.

  • diagnostic_name (str, optional) – Name of the diagnostic. Default is “ocean_stratification”.

  • vert_coord (str, optional) – Vertical coordinate name. Default is DEFAULT_OCEAN_VERT_COORD (‘level’).

  • outputdir (str, optional) – Directory to save output plots. Default is “.”.

  • loglevel (str, optional) – Logging level. Default is “WARNING”.

plot_stratification(rebuild: bool = True, save_format: str | list = ['png', 'pdf', 'svg'], dpi: int = 300)

Generate and save the stratification vertical profile plot.

Parameters:
  • rebuild (bool, optional) – If True, rebuild existing output files. Default is True.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • dpi (int, optional) – Resolution of the saved figure. Default is 300.

save_plot(fig, diagnostic_product: str = None, extra_keys: dict = None, rebuild: bool = True, metadata: dict = None, dpi: int = 300, format: str | list = ['png', 'pdf', 'svg'])

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure to be saved.

  • diagnostic_product (str) – The name of the diagnostic product. Default is None.

  • extra_keys (dict) – Extra keys to be used for the filename (e.g. season). Default is None.

  • rebuild (bool) – If True, the output files will be rebuilt. Default is True.

  • dpi (int) – The dpi of the figure. Default is 300.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • metadata (dict) – Metadata to attach to the figure. Default is None. It is complemented with the metadata from the outputsaver; this is typically where the description of the figure is added.

set_cbar_labels(var: str = None)

Set the colorbar label for the given variable.

Parameters:

var (str, optional) – Variable name to derive the colorbar label from.

set_cbar_limits()

Set the colorbar limits and number of levels for MLD plots.

set_data_list()

Populate the data and reference data lists for plotting.

set_description()

Build the figure description string including model and observation date ranges.

set_label_line_plot()

Set the legend labels for the model and observation lines.

set_nrowcol()

Set the number of rows and columns for the subplot grid.

set_suptitle()

Set the title for the MLD plot.

set_title()

Set the title for each subplot panel.

set_ytext()

Set the y-axis text labels for each subplot.

class aqua.diagnostics.PlotTimeseries(diagnostic_name: str = 'timeseries', hourly_data=None, daily_data=None, monthly_data=None, annual_data=None, ref_hourly_data=None, ref_daily_data=None, ref_monthly_data=None, ref_annual_data=None, std_hourly_data=None, std_daily_data=None, std_monthly_data=None, std_annual_data=None, loglevel: str = 'WARNING')

Bases: PlotBaseMixin

Class to plot time series data.

Initialize the PlotTimeseries class. This class is used to plot time series data previously processed by the Timeseries class.

Any subset of frequency can be provided, however the order and length of the list of data arrays must be the same for each frequency.

Note: Currently, only monthly and annual data are supported. Additionally, only one reference data array is supported for each frequency.

Parameters:
  • diagnostic_name (str) – The name of the diagnostic. Used for logger and filenames. Default is ‘timeseries’.

  • hourly_data (list) – List of hourly data arrays.

  • daily_data (list) – List of daily data arrays.

  • monthly_data (list) – List of monthly data arrays.

  • annual_data (list) – List of annual data arrays.

  • ref_hourly_data (xr.DataArray) – Reference hourly data array.

  • ref_daily_data (xr.DataArray) – Reference daily data array.

  • ref_monthly_data (xr.DataArray) – Reference monthly data array.

  • ref_annual_data (xr.DataArray) – Reference annual data array.

  • std_hourly_data (xr.DataArray) – Standard deviation hourly data array.

  • std_daily_data (xr.DataArray) – Standard deviation daily data array.

  • std_monthly_data (xr.DataArray) – Standard deviation monthly data array.

  • std_annual_data (xr.DataArray) – Standard deviation annual data array.

  • loglevel (str) – Logging level. Default is ‘WARNING’.
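The requirement that every provided frequency carries lists of the same length can be checked before instantiating the class. The following is a minimal sketch of such a check, with strings standing in for data arrays; check_frequencies is a hypothetical helper, not part of AQUA, and it verifies only the lengths, not the ordering.

```python
def check_frequencies(**freq_lists):
    """Hypothetical check: all provided lists must have the same length."""
    # Frequencies passed as None are treated as "not provided".
    provided = {name: data for name, data in freq_lists.items() if data is not None}
    lengths = {name: len(data) for name, data in provided.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Inconsistent list lengths: {lengths}")
    return lengths


lengths = check_frequencies(
    monthly_data=["exp_a_monthly", "exp_b_monthly"],
    annual_data=["exp_a_annual", "exp_b_annual"],
    daily_data=None,
)
```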

get_data_info()

Extract the data needed for labels, descriptions, etc. from the attributes of the data arrays.

The attributes are: AQUA_catalog, AQUA_model, AQUA_exp, AQUA_region, AQUA_startdate, AQUA_enddate, AQUA_std_startdate, AQUA_std_enddate, short_name, long_name, units.

plot_timeseries(data_labels=None, ref_label=None, title=None)

Plot the time series data.

Parameters:
  • data_labels (list) – List of data labels.

  • ref_label (str) – Reference label.

  • title (str) – Title of the plot.

Returns:
  • fig (matplotlib.figure.Figure) – Figure object.

  • ax (matplotlib.axes.Axes) – Axes object.

run(outputdir: str = './', rebuild: bool = True, dpi: int = 300, format: str = 'png')

Run the PlotTimeseries class.

Parameters:
  • outputdir (str) – Output directory to save the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • dpi (int) – Dots per inch for the plot.

  • format (str) – Format of the plot (‘png’ or ‘pdf’). Default is ‘png’.

save_plot(fig, description: str = None, rebuild: bool = True, outputdir: str = './', dpi: int = 300, format: str = 'png')

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – Figure object.

  • description (str) – Description of the plot.

  • rebuild (bool) – If True, rebuild the plot even if it already exists.

  • outputdir (str) – Output directory to save the plot.

  • dpi (int) – Dots per inch for the plot.

  • format (str) – Format of the plot (‘png’ or ‘pdf’). Default is ‘png’.

set_description()

Set the caption for the plot. The caption is extracted from the data arrays attributes and the reference data arrays attributes. The caption is stored as ‘Description’ in the metadata dictionary.

Returns:

Caption for the plot.

Return type:

description (str)

set_title()

Set the title for the plot.

Returns:

Title for the plot.

Return type:

title (str)

class aqua.diagnostics.PlotTrends(data: Dataset, diagnostic_name: str = 'trends', vert_coord: str = 'level', outputdir: str = '.', rebuild: bool = True, loglevel: str = 'WARNING')

Bases: object

Class for plotting ocean trend diagnostics from xarray Datasets.

Class to plot trends from xarray Dataset.

Parameters:
  • data (xr.Dataset) – Input xarray Dataset containing trend data.

  • diagnostic_name (str, optional) – Name of the diagnostic for filenames. Defaults to “trends”.

  • vert_coord (str, optional) – Name of the vertical dimension coordinate. Defaults to DEFAULT_OCEAN_VERT_COORD (‘level’).

  • outputdir (str, optional) – Directory to save output plots. Defaults to “.”.

  • rebuild (bool, optional) – Whether to rebuild output files. Defaults to True.

  • loglevel (str, optional) – Logging level. Default is “WARNING”.

plot_multilevel(levels: list = None, rebuild: bool = True, cbar_limits: dict = None, sym: bool = False, save_format: str | list = ['png', 'pdf', 'svg'], dpi: int = 300)

Plot multi-level maps of trends.

Parameters:
  • levels (list, optional) – List of depth levels to plot. Defaults to None.

  • rebuild (bool, optional) – If True, rebuild existing output files. Defaults to True.

  • cbar_limits (dict, optional) – Per-variable colorbar limits as {var: {‘vmin’: v, ‘vmax’: v}}. Defaults to None.

  • sym (bool, optional) – If True, use symmetric colorbar limits. Defaults to False.

  • save_format (str or list, optional) – Format(s) to save the figure in. Defaults to SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • dpi (int, optional) – Resolution of the saved figure. Defaults to 300.
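The cbar_limits layout {var: {‘vmin’: v, ‘vmax’: v}} and the sym flag can be illustrated with a small sketch. The helper resolve_limits, the fallback to the data range, the way sym interacts with explicit limits, and the variable names thetao and so are all assumptions made for this example.

```python
def resolve_limits(var, values, cbar_limits=None, sym=False):
    """Hypothetical helper returning (vmin, vmax) for one variable."""
    limits = (cbar_limits or {}).get(var, {})
    # Fall back to the data range when no explicit limit is given (assumption).
    vmin = limits.get("vmin", min(values))
    vmax = limits.get("vmax", max(values))
    if sym:
        # Symmetric limits centred on zero.
        bound = max(abs(vmin), abs(vmax))
        vmin, vmax = -bound, bound
    return vmin, vmax


cbar_limits = {"thetao": {"vmin": -0.5, "vmax": 0.5}}
fixed = resolve_limits("thetao", [-0.2, 0.9], cbar_limits)
auto_sym = resolve_limits("so", [-0.1, 0.3], cbar_limits, sym=True)
```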

plot_zonal(rebuild: bool = True, save_format: str | list = ['png', 'pdf', 'svg'], dpi: int = 300)

Plot zonal mean vertical profiles of trends.

Parameters:
  • rebuild (bool, optional) – If True, rebuild existing output files. Defaults to True.

  • save_format (str or list, optional) – Format(s) to save the figure in. Defaults to SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • dpi (int, optional) – Resolution of the saved figure. Defaults to 300.

save_plot(fig, diagnostic_product: str, extra_keys: dict = {}, rebuild: bool = True, dpi: int = 300, format: str | list = ['png', 'pdf', 'svg'], metadata: dict = None)

Save the plot to a file.

Parameters:
  • fig (matplotlib.figure.Figure) – The figure to be saved.

  • diagnostic_product (str) – The name of the diagnostic product.

  • extra_keys (dict) – Extra keys to be used for the filename (e.g. season). Default is an empty dictionary.

  • rebuild (bool) – If True, the output files will be rebuilt. Default is True.

  • dpi (int) – The dpi of the figure. Default is 300.

  • format (str or list) – Format(s) to save the figure. Default is SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • metadata (dict) – Metadata to attach to the figure. Default is None. It is complemented with the metadata from the outputsaver; this is typically where the description of the figure is added.

set_cbar_labels()

Set the colorbar labels for each subplot from variable units.

set_central_longitude()

Set the central longitude for the map projection from the data.

set_data_list()

Prepare the list of data arrays to plot.

set_description(content=None)

Set the description metadata for the plot.

set_extent()

Set the extent for the plot.

set_nrowcol()

Set the number of rows and columns for the subplot grid.

set_suptitle(plot_type=None)

Set the title for the plot.

set_title()

Set the title for each subplot panel.

set_vmin_vmax()

Set per-variable colorbar min/max from cbar_limits if provided.

set_ytext()

Set the y-axis text for the multi-level plots.

class aqua.diagnostics.SeaIce(model: str, exp: str, source: str, catalog=None, regrid=None, startdate=None, enddate=None, std_startdate=None, std_enddate=None, threshold=0.15, regions=['arctic', 'antarctic'], regions_file=None, outputdir: str = './', loglevel: str = 'WARNING')

Bases: Diagnostic

Sea ice diagnostic class for computing and analyzing sea ice metrics.

This class provides methods to compute sea ice extent (million km²), volume (thousand km³), fraction (dimensionless, 1) and thickness (m) over specified regions (e.g., Arctic, Antarctic). It supports integrated time series, with options for computing standard deviations and seasonal cycles, as well as 2D monthly climatologies.

Parameters:
  • model (str) – The model name.

  • exp (str) – The experiment name.

  • source (str) – The data source.

  • catalog (str, optional) – The catalog name.

  • regrid (str, optional) – The regrid option.

  • startdate (str, optional) – The start date for the data (format: “YYYY-MM-DD”).

  • enddate (str, optional) – The end date for the data (format: “YYYY-MM-DD”).

  • std_startdate (str, optional) – Start date for standard deviation.

  • std_enddate (str, optional) – End date for standard deviation.

  • threshold (float, optional) – Sea ice concentration threshold used to compute the extent (default: 0.15, i.e. 15% concentration).

  • regions (list, optional) – A list of regions to analyze. Default: [‘arctic’, ‘antarctic’].

  • regions_file (str, optional) – Path to the YAML file containing the region definitions.

  • outputdir (str, optional) – The output directory (default: ‘./’).

  • regions_definition (dict) – The loaded regions definition from the YAML file.

  • loglevel (str, optional) – The logging level. Defaults to ‘WARNING’.

Initialize the diagnostic class. This is a general purpose class that can be used by the diagnostic classes to retrieve data from a single model and to save the data to a netcdf file. It is not a working diagnostic class by itself.

Parameters:
  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • regrid (str | None) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str | None) – The start date of the plot/analysis period. If None, all available data will be used.

  • enddate (str | None) – The end date of the plot/analysis period. If None, all available data will be used.

  • std_startdate (str | None) – The start date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • std_enddate (str | None) – The end date of the standard deviation period. If None, no std period is tracked at the Diagnostic level.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 1

add_seaice_attrs(da_seaice_computed: DataArray, region: str, startdate: str = None, enddate: str = None, std_flag=False)

Adds metadata attributes to a computed sea ice DataArray. This function assigns descriptive attributes to an xr.DataArray representing computed sea ice (extent or volume) for a specific region and time period.

Parameters:
  • da_seaice_computed (xr.DataArray) – The computed sea ice data to which attributes will be added.

  • region (str) – The geographical region over which sea ice data is computed.

  • startdate (str, optional) – The start date of the data (format “%Y-%m”). Defaults to None.

  • enddate (str, optional) – The end date of the data (format “%Y-%m”). Defaults to None.

  • std_flag (bool, optional) – If True, add the std computation as AQUA_std_startdate and AQUA_std_enddate. Defaults to False.

Returns:

xr.DataArray

compute_seaice(method: str = 'extent', var: str = None, *args, **kwargs)

Execute the seaice diagnostic based on the specified method.

Parameters:
  • var (str) – The variable to be used for computation. Default is ‘sithick’ or ‘siconc’.

  • method (str) – The method to compute sea ice metrics. Options are ‘extent’ or ‘volume’.

Kwargs:
  • threshold (float): The threshold value for which sea ice fraction is considered. Default is 0.15.

  • reader_kwargs (dict, optional): Additional keyword arguments to pass to the Reader.

Returns:

The computed sea ice metric. A Dataset is returned if multiple regions are requested.

Return type:

xr.DataArray or xr.Dataset
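The relation between method and the default variable can be sketched as a small dispatch table. The mapping below (extent uses ‘siconc’, volume uses ‘sithick’) is an assumption inferred from the parameter descriptions above, and pick_variable is a hypothetical helper rather than the library's code.

```python
# Assumed mapping between the method and its default variable.
DEFAULT_VARS = {"extent": "siconc", "volume": "sithick"}


def pick_variable(method: str = "extent", var: str = None) -> str:
    """Hypothetical helper: validate the method and choose the variable."""
    if method not in DEFAULT_VARS:
        raise ValueError(f"Unknown method '{method}', expected one of {sorted(DEFAULT_VARS)}")
    # An explicit variable name always wins over the default.
    return var if var is not None else DEFAULT_VARS[method]


default_extent_var = pick_variable("extent")
custom_volume_var = pick_variable("volume", var="my_thickness")
```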

get_area_cells_and_coords(masked_data: DataArray)

Get the cell areas (areacello) and the spatial coordinates.

Parameters:

masked_data (xr.DataArray) – The masked data, checked to determine whether it has been regridded.

Returns:

The area grid cells (m^2).

Return type:

xr.DataArray

integrate_seaice(masked_data, region: str)

Integrate the masked data over the spatial dimension to compute sea ice metrics. If method is extent / volume, divide by 1e12 to convert to million km^2 / thousand km^3.

Parameters:
  • masked_data (xr.DataArray) – The masked data to be integrated.

  • region (str) – The region for which the sea ice metric is computed.

Returns:

The computed sea ice metric.

Return type:

xr.DataArray
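The factor 1e12 follows from the units: 1 million km² = 1e6 × 1e6 m² and 1 thousand km³ = 1e3 × 1e9 m³. The sketch below works this through on three made-up grid cells using plain lists. The function names, the use of ">=" for the threshold, and the definition of volume as thickness times cell area are assumptions for illustration only.

```python
M2_PER_MILLION_KM2 = 1e12   # 1 km^2 = 1e6 m^2, times 1e6
M3_PER_THOUSAND_KM3 = 1e12  # 1 km^3 = 1e9 m^3, times 1e3


def integrate_extent(siconc, area_m2, threshold=0.15):
    """Sum the area of cells at or above the threshold, in million km^2."""
    total_m2 = sum(a for c, a in zip(siconc, area_m2) if c >= threshold)
    return total_m2 / M2_PER_MILLION_KM2


def integrate_volume(sithick, area_m2):
    """Sum thickness times area over all cells, in thousand km^3."""
    total_m3 = sum(h * a for h, a in zip(sithick, area_m2))
    return total_m3 / M3_PER_THOUSAND_KM3


area = [1e10, 1e10, 1e10]  # three cells of 10,000 km^2 each
extent = integrate_extent([0.9, 0.1, 0.2], area)
volume = integrate_volume([2.0, 0.0, 1.0], area)
```

With these inputs two cells pass the 15% threshold, giving an extent of 0.02 million km², and the volume is 0.03 thousand km³.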

load_regions(regions_file=None, regions=None)

Loads region definitions from a .yaml configuration file and sets the selected regions.

Parameters:
  • regions_file (str) – Full path to the region file. If None, a default path is used.

  • regions (str or list of str) – A region or list of region names to load. If None, all regions from the configuration are used.
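The selection rule for regions (a single name, a list of names, or None for all regions) can be sketched on an in-memory dictionary. The layout of the definitions dictionary below is invented for illustration and does not reflect the actual YAML schema; select_regions is a hypothetical helper.

```python
def select_regions(definitions: dict, regions=None) -> dict:
    """Hypothetical helper: keep only the requested region definitions."""
    if regions is None:
        return dict(definitions)  # None means "use every defined region"
    if isinstance(regions, str):
        regions = [regions]
    missing = [name for name in regions if name not in definitions]
    if missing:
        raise KeyError(f"Regions not found in definitions: {missing}")
    return {name: definitions[name] for name in regions}


# Made-up definitions; the real file layout may differ.
definitions = {
    "arctic": {"lat_limits": [0, 90]},
    "antarctic": {"lat_limits": [-90, 0]},
}
only_arctic = select_regions(definitions, "arctic")
everything = select_regions(definitions)
```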

save_netcdf(seaice_data, diagnostic: str, diagnostic_product: str = None, rebuild: bool = True, output_file: str = None, **kwargs)

Save the computed sea ice data to a NetCDF file.

Parameters:
  • seaice_data (xr.DataArray or xr.Dataset) – The computed sea ice metric data.

  • diagnostic (str) – The diagnostic name. It is expected ‘SeaIce’ for this class.

  • diagnostic_product (str, optional) – The diagnostic product. Can be used for naming the file more freely.

  • rebuild (bool, optional) – If True, rebuild (overwrite) the NetCDF file. Default is True.

  • output_file (str, optional) – The output file name.

  • **kwargs – Additional keyword arguments for saving the data.

select_region_area_cell(areacello: DataArray, region: str, drop: bool = True)

Select the area cells for a specific region based on the region definition.

Parameters:
  • areacello (xr.DataArray) – The area cells DataArray.

  • region (str) – The region for which to select the area cells.

Returns:

The area cells DataArray filtered by the region coordinates.

Return type:

xr.DataArray
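The drop argument is not described above. One plausible reading, borrowed from how xarray's where(drop=...) behaves, is that cells outside the region are either removed or kept as missing values. The sketch below illustrates that reading on (lat, lon, area) tuples; select_box and its limits arguments are hypothetical.

```python
import math


def select_box(cells, lat_limits, lon_limits, drop=True):
    """Hypothetical helper: keep cells inside a lat/lon box.

    With drop=False, cells outside the box are kept with a NaN area.
    """
    selected = []
    for lat, lon, area in cells:
        inside = (lat_limits[0] <= lat <= lat_limits[1]
                  and lon_limits[0] <= lon <= lon_limits[1])
        if inside:
            selected.append((lat, lon, area))
        elif not drop:
            selected.append((lat, lon, math.nan))
    return selected


cells = [(80.0, 10.0, 1.0), (10.0, 10.0, 2.0), (75.0, 200.0, 3.0)]
dropped = select_box(cells, [60, 90], [0, 360])
kept = select_box(cells, [60, 90], [0, 360], drop=False)
```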

class aqua.diagnostics.SeasonalCycles(diagnostic_name: str = 'seasonalcycles', catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, std_startdate: str = None, std_enddate: str = None, region: str = None, lon_limits: list = None, lat_limits: list = None, loglevel: str = 'WARNING')

Bases: BaseMixin

SeasonalCycles class for retrieving the data of a single experiment and saving it to netCDF.

Initialize the SeasonalCycles class.

Parameters:
  • diagnostic_name (str) – The name of the diagnostic. Used for logger and filenames. Default is ‘seasonalcycles’.

  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • regrid (str) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str) – The start date of the data to be retrieved. If None, all available data will be retrieved.

  • enddate (str) – The end date of the data to be retrieved. If None, all available data will be retrieved.

  • std_startdate (str) – The start date of the standard deviation evaluation period.

  • std_enddate (str) – The end date of the standard deviation evaluation period.

  • region (str) – The region to select. This will define the lon and lat limits.

  • lon_limits (list) – The longitude limits to be used. Overridden by region.

  • lat_limits (list) – The latitude limits to be used. Overridden by region.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 2

compute(exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True)

Compute the seasonal cycles.

Parameters:
  • exclude_incomplete (bool) – If True, exclude incomplete periods.

  • center_time (bool) – If True, the time will be centered.

  • box_brd (bool) – If True, the boundary coordinates are included in the area selection.
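Conceptually, a seasonal cycle is the average of all values that fall in the same calendar month. The sketch below shows that grouping on plain lists. It is an illustration of the idea only and does not reproduce the library's handling of incomplete periods, time centering or area selection; seasonal_cycle is a hypothetical name.

```python
from collections import defaultdict


def seasonal_cycle(months, values):
    """Average the values month by month (months are integers 1..12)."""
    buckets = defaultdict(list)
    for month, value in zip(months, values):
        buckets[month].append(value)
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


# Two years of January and February values (made-up numbers).
cycle = seasonal_cycle([1, 2, 1, 2], [1.0, 2.0, 3.0, 4.0])
```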

run(var: str, formula: bool = False, long_name: str = None, units: str = None, short_name: str = None, std: bool = False, exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True, outputdir: str = './', rebuild: bool = True, reader_kwargs: dict = {}, create_catalog_entry: bool = False)

Run all the steps necessary for the computation of the seasonal cycles. Save the results to netcdf files.

Parameters:
  • var (str) – The variable to be used.

  • formula (bool) – If True, the variable is a formula.

  • long_name (str) – The long name of the variable, if different from the variable name.

  • units (str) – The units of the variable, if different from the original units.

  • short_name (str) – The short name of the variable, if different from the variable name.

  • std (bool) – If True, compute the standard deviation. Default is False.

  • exclude_incomplete (bool) – If True, exclude incomplete periods.

  • center_time (bool) – If True, the time will be centered.

  • box_brd (bool) – If True, the boundary coordinates are included in the area selection.

  • outputdir (str) – The directory to save the data.

  • rebuild (bool) – If True, rebuild the data.

  • reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

  • create_catalog_entry (bool) – If True, create a catalog entry for the data. Default is False.

class aqua.diagnostics.SshVariabilityCompute(diagnostic_name: str = 'sshVariability', catalog: str = None, model: str = None, exp: str = None, source: str = None, startdate: str = None, enddate: str = None, freq: str = None, region: str = None, regrid: str = None, lon_limits: list[float] = None, lat_limits: list[float] = None, var: str = 'zos', long_name: str = None, short_name: str = None, units: str = None, save_netcdf: bool = True, rebuild: bool = True, outputdir: str = './', reader_kwargs: dict = {}, loglevel: str = 'WARNING')

Bases: BaseMixin

SSH Computation

Initialize the ‘SshVariabilityCompute’ class.

This class is designed to load an xarray.Dataset and compute its standard deviation (STD).

Parameters:
  • diagnostic_name (str) – Default is ‘sshVariability’.

  • catalog (str) – Catalog name. Mandatory if ‘save_netcdf=True’.

  • model (str) – Name of the model.

  • exp (str) – Name of the experiment.

  • source (str) – Name of the source.

  • startdate (str) – Start date. It is important to provide the start and end dates; otherwise the whole dataset is retrieved.

  • enddate (str) – End date.

  • freq (str) – Frequency of the data. Not yet used (on the TODO list); it becomes important when implementing the ‘variance of the variances formula’.

  • region (str) – For subregion selection. Default is ‘None’. In case of sub-region STD computation, this variable is mandatory.

  • regrid (str) – Regrid option for the data. NOTE: the regridding will be applied before computing the STD.

  • lon_limits (list[float]) – List of longitude limits. Default is ‘None’. If ‘lon_limits’ and ‘lat_limits’ are None, they are taken from the region file in AQUA.

  • lat_limits (list[float]) – List of latitude limits. Default is ‘None’.

  • var (str) – Variable name for ssh data. Default is ‘zos’.

  • long_name (str) – If not given extracted from the data.

  • short_name (str) – If not given extracted from the data.

  • units (str) – If not given extracted from the data.

  • save_netcdf (bool) – Default is ‘True’.

  • rebuild (bool) – Recomputes and saves the netcdf. Default is “True”.

  • outputdir (str) – output directory. Default is ‘./’

  • loglevel (str) – Default WARNING.

Keyword Arguments:
  • zoom (int, optional) – HEALPix grid zoom level (e.g. zoom=10 is h1024). Allows for multiple gridname definitions.

  • realization (int, optional) – The ensemble realization number, included in the output filename.

  • **kwargs – Additional arbitrary keyword arguments to be passed as additional parameters to the intake catalog entry

run()

Perform the following three steps: a) retrieve the data, regridding it if requested; b) compute the STD; c) save the result to a netCDF file.

Parameters:

create_catalog_entry (bool) – Option for creating a catalog entry. Default is ‘False’.
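The STD step amounts to a standard deviation along the time axis at each grid point. A minimal sketch with plain lists follows; the population form (dividing by N) is an assumption, and std_over_time is a hypothetical name.

```python
import statistics


def std_over_time(series_per_point):
    """Population standard deviation of each grid point's time series."""
    return [statistics.pstdev(series) for series in series_per_point]


# Two grid points with made-up SSH values (in metres) over time.
ssh = [
    [0.0, 2.0],        # varies between time steps
    [1.0, 1.0, 1.0],   # constant in time
]
ssh_std = std_over_time(ssh)
```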

class aqua.diagnostics.SshVariabilityPlot(diagnostic_name='sshVariability', outputdir='./', loglevel='WARNING')

Bases: PlotBaseMixin

Plot sshVariability and the difference of sshVariability

Initialize the SshVariabilityPlot class.

Parameters:
  • diagnostic_name (str) – sshVariability

  • outputdir (str) – output directory

  • loglevel (str) – Default WARNING

plot(var=None, dataset_std=None, catalog=None, model=None, exp=None, startdate=None, enddate=None, plot_options={}, figsize: tuple = (11, 8.5), ax_pos: tuple = (1, 1, 1), vmin=None, vmax=None, gridlines=True, proj='robinson', proj_params={}, save_format=['png', 'pdf', 'svg'], dpi=600, region=None, lon_limits=None, lat_limits=None, mask_options={}, mask_northern_boundary=True, mask_southern_boundary=True, northern_boundary_latitude=70, southern_boundary_latitude=-62, diagnostic_product='sshVariability', rebuild: bool = True, description=None, tgt_grid_name='r1440x721', regrid_method='ycon')

Visualize the SSH variability.

Plot the variability of sea surface height (SSH) from an input dataset.

This function visualizes SSH variability using configurable spatial, temporal, and plotting options. It supports contours, regional selection, custom projections, masking, and output saving in multiple formats.

Parameters:
  • var (str, optional) – Variable name for SSH, e.g., 'zos'.

  • dataset_std (xarray.Dataset, optional) – Dataset containing the SSH field to be plotted.

  • catalog (str, optional) – Catalog name. Used in plot titles. (Mandatory for labeling)

  • model (str, optional) – Model or dataset name. Used in plot titles. (Mandatory for labeling)

  • exp (str, optional) – Experiment identifier. Used in plot titles. (Mandatory for labeling)

  • startdate (str, optional) – Start date label to include in the plot title.

  • enddate (str, optional) – End date label to include in the plot title.

  • regrid (str or dict, optional) – Regridding option or parameters for spatial interpolation.

  • plot_options (dict, optional) – Additional keyword arguments for customizing the plot (e.g., colormap, linewidth).

  • vmin (float, optional) – Minimum value for color scaling. If None, determined automatically.

  • vmax (float, optional) – Maximum value for color scaling. If None, determined automatically.

  • proj (str, optional) – Map projection type. Default is 'robinson'.

  • proj_params (dict, optional) – Additional keyword arguments passed to the projection.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • dpi (int, optional) – Resolution (dots per inch) for saved figures. Default is 600.

  • region (str, optional) – Region identifier. If provided, overrides lat/lon limits.

  • lon_limits (list[float], optional) – Longitude limits [min, max] for the plot.

  • lat_limits (list[float], optional) – Latitude limits [min, max] for the plot.

  • mask_options (dict, optional) – Options for masking grid cells (specific to ICON).

  • mask_northern_boundary (bool, optional) – If True, mask latitudes north of northern_boundary_latitude.

  • mask_southern_boundary (bool, optional) – If True, mask latitudes south of southern_boundary_latitude.

  • northern_boundary_latitude (float, optional) – Latitude above which data will be masked. Default is 70.

  • southern_boundary_latitude (float, optional) – Latitude below which data will be masked. Default is -62.

  • diagnostic_product (str, optional) – Diagnostic type, e.g., 'sshVariability'. Default is 'sshVariability'.

  • rebuild (bool, optional) – If True, rebuild the data from the original files. Default is True.

  • description (str, optional) – Additional description to include in the plot or metadata.

  • tgt_grid_name (str, optional) – Target grid name for regridding. Default is ‘r1440x721’.

  • regrid_method (str, optional) – Regridding method to use. Default is ‘ycon’.

Returns:

The generated plot figure object.

Return type:

matplotlib.figure.Figure

Raises:
  • ValueError – If required arguments (e.g., catalog, model, exp) are missing.

  • TypeError – If inputs are of invalid type (e.g., dataset not an xarray.Dataset).
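The polar masking controlled by the four boundary parameters can be sketched as follows. Values poleward of the boundary latitudes are replaced by NaN. Whether the comparison is strict is an assumption here, and mask_boundaries is a hypothetical helper operating on plain lists rather than on an xarray object.

```python
import math


def mask_boundaries(lats, values, north=70, south=-62,
                    mask_north=True, mask_south=True):
    """Replace values north of `north` or south of `south` with NaN."""
    masked = []
    for lat, value in zip(lats, values):
        if (mask_north and lat > north) or (mask_south and lat < south):
            masked.append(math.nan)
        else:
            masked.append(value)
    return masked


lats = [-80, -62, 0, 70, 85]
masked = mask_boundaries(lats, [1.0, 2.0, 3.0, 4.0, 5.0])
north_only = mask_boundaries(lats, [1.0, 2.0, 3.0, 4.0, 5.0], mask_south=False)
```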

plot_diff(var=None, dataset_std=None, catalog=None, model=None, exp=None, startdate=None, enddate=None, dataset_std_ref=None, catalog_ref=None, model_ref=None, exp_ref=None, startdate_ref=None, enddate_ref=None, figsize: tuple = (11, 8.5), ax_pos: tuple = (1, 1, 1), plot_options={}, vmin_diff=None, vmax_diff=None, gridlines=True, proj='robinson', proj_params={}, save_format=['png', 'pdf', 'svg'], dpi=600, region=None, lon_limits=None, lat_limits=None, mask_options={}, mask_northern_boundary=True, mask_southern_boundary=True, northern_boundary_latitude=70, southern_boundary_latitude=-62, diagnostic_product='sshVariability_Difference', description=None, rebuild: bool = True, tgt_grid_name='r1440x721', regrid_method='ycon')

Visualize the difference in sea surface height (SSH) variability between a model and a reference dataset.

This function generates a map of SSH variability differences using Cartopy projections, supporting custom contours, masking, regional selection, and configurable plotting options. The plot can be saved in one or more formats (PNG, PDF and SVG by default).

Parameters:
  • var (str, optional) – Variable name to plot (e.g., ‘zos’).

  • dataset_std (xarray.Dataset, optional) – Dataset of the model to be plotted.

  • catalog (str, optional) – Catalog name for the model dataset (used in plot title).

  • model (str, optional) – Model name of the dataset (used in plot title).

  • exp (str, optional) – Experiment name of the dataset (used in plot title).

  • startdate (str, optional) – Start date of the dataset for the plot title.

  • enddate (str, optional) – End date of the dataset for the plot title.

  • dataset_std_ref (xarray.Dataset, optional) – Reference dataset for comparison.

  • catalog_ref (str, optional) – Catalog name for the reference dataset.

  • model_ref (str, optional) – Model name of the reference dataset.

  • exp_ref (str, optional) – Experiment name of the reference dataset.

  • startdate_ref (str, optional) – Start date of the reference dataset.

  • enddate_ref (str, optional) – End date of the reference dataset.

  • regrid (str or dict, optional) – Regridding method or parameters.

  • plot_options (dict, optional) – Additional keyword arguments for plotting (e.g., colormap, alpha).

  • vmin_diff (float, optional) – Minimum value for color scaling. If None, determined automatically.

  • vmax_diff (float, optional) – Maximum value for color scaling. If None, determined automatically.

  • proj (str, optional) – Map projection. Default is ‘robinson’.

  • proj_params (dict, optional) – Additional keyword arguments for the projection.

  • save_format (str or list, optional) – Format(s) to save the figure. Default is SAVE_FORMAT ([‘png’, ‘pdf’, ‘svg’]).

  • dpi (int, optional) – Resolution of the saved figure. Default is 600.

  • region (str, optional) – Region identifier for the plot.

  • lon_limits (list[float], optional) – Longitude limits [min, max] for the plot.

  • lat_limits (list[float], optional) – Latitude limits [min, max] for the plot.

  • mask_options (dict, optional) – Options for masking (specific to ICON grids).

  • mask_northern_boundary (bool, optional) – Mask latitudes above northern_boundary_latitude. Default is True.

  • mask_southern_boundary (bool, optional) – Mask latitudes below southern_boundary_latitude. Default is True.

  • northern_boundary_latitude (float, optional) – Latitude above which data is masked. Default is 70.

  • southern_boundary_latitude (float, optional) – Latitude below which data is masked. Default is -62.

  • diagnostic_product (str, optional) – Diagnostic product identifier. Default is ‘sshVariability_Difference’.

  • description (str, optional) – Additional description for the plot metadata or title.

  • rebuild (bool, optional) – If True, rebuild the data from the original files. Default is True.

  • tgt_grid_name (str, optional) – Target grid name for regridding. Default is ‘r1440x721’.

  • regrid_method (str, optional) – Regridding method to use. Default is ‘ycon’.

Returns:

The generated figure object.

Return type:

matplotlib.figure.Figure

Raises:
  • ValueError – If required dataset or catalog/model/exp information is missing.

  • TypeError – If input datasets are not xarray.Datasets.

subregion_selection(data=None, model=None, exp=None, mask_northern_boundary=None, northern_boundary_latitude=None, mask_southern_boundary=None, southern_boundary_latitude=None, lon_lim=None, lat_lim=None, region_name=None)

Select a sub-region based on longitude and latitude limits.
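The boundary options documented above (mask_northern_boundary, northern_boundary_latitude, mask_southern_boundary, southern_boundary_latitude) can be pictured with a small NumPy sketch. It is not the AQUA implementation: the helper name mask_boundaries and the (lat, lon) array layout are assumptions made for illustration, and only the default latitudes (70 and -62) come from the parameter list.

```python
import numpy as np

def mask_boundaries(field, lat,
                    mask_northern_boundary=True, northern_boundary_latitude=70.0,
                    mask_southern_boundary=True, southern_boundary_latitude=-62.0):
    """Hypothetical helper: return a copy of a (lat, lon) field with rows
    outside the latitude boundaries set to NaN."""
    masked = np.array(field, dtype=float, copy=True)
    lat = np.asarray(lat, dtype=float)
    if mask_northern_boundary:
        # Mask everything strictly north of the northern boundary.
        masked[lat > northern_boundary_latitude, :] = np.nan
    if mask_southern_boundary:
        # Mask everything strictly south of the southern boundary.
        masked[lat < southern_boundary_latitude, :] = np.nan
    return masked

lat = np.array([-80.0, -62.0, 0.0, 70.0, 85.0])
field = np.ones((lat.size, 3))
masked = mask_boundaries(field, lat)
```

In this sketch, rows lying exactly on a boundary latitude are kept; whether the library treats the boundary as inclusive is not stated in this reference.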

class aqua.diagnostics.StatGlobalBiases(loglevel: str = 'WARNING')

Bases: object

Class for computing bias statistics between model and reference data. It works directly with xarray datasets and includes statistical significance testing.

Parameters:

loglevel (str) – Log level. Default is ‘WARNING’.

compute_bias_statistics(data: Dataset, data_ref: Dataset, var: str, area: DataArray = None) → Dataset

Compute global mean bias and RMSE between model and reference data.

Parameters:
  • data (xr.Dataset) – Model climatology dataset.

  • data_ref (xr.Dataset) – Reference climatology dataset.

  • var (str) – Variable name.

  • area (xr.DataArray, optional) – Grid cell areas for weighted statistics. If None, unweighted statistics will be computed.

Returns:

Dataset containing mean bias and RMSE.

Return type:

xr.Dataset
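As a rough illustration of what "weighted" and "unweighted" statistics mean here, the following NumPy sketch computes a mean bias and an RMSE with optional area weights. The function name bias_statistics, the dictionary return value and the handling of missing values are assumptions for this example; the real method works on xarray datasets and its internals are not shown in this reference.

```python
import numpy as np

def bias_statistics(model, ref, area=None):
    """Hypothetical helper: (optionally area-weighted) mean bias and RMSE."""
    diff = np.asarray(model, dtype=float) - np.asarray(ref, dtype=float)
    valid = np.isfinite(diff)
    if area is None:
        weights = np.ones_like(diff)          # unweighted statistics
    else:
        weights = np.broadcast_to(np.asarray(area, dtype=float), diff.shape)
    # Points with missing data get zero weight and contribute nothing.
    w = np.where(valid, weights, 0.0)
    d = np.where(valid, diff, 0.0)
    mean_bias = np.sum(w * d) / np.sum(w)
    rmse = np.sqrt(np.sum(w * d**2) / np.sum(w))
    return {"mean_bias": float(mean_bias), "rmse": float(rmse)}

model = np.array([[1.0, 2.0], [3.0, 4.0]])
ref = np.array([[0.0, 2.0], [3.0, 2.0]])
area = np.array([[1.0, 1.0], [3.0, 3.0]])

stats_unweighted = bias_statistics(model, ref)
stats_weighted = bias_statistics(model, ref, area)
```

With these toy numbers, the larger cells in the second row pull the weighted bias towards the error found there.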

compute_significance_ttest(data: Dataset, data_ref: Dataset, var: str, alpha: float = 0.05, min_samples: int = 3) → DataArray

Compute statistical significance of bias using two-sample t-test.

Performs a two-sided t-test at each grid point to determine if the difference between model and reference data is statistically significant.

Parameters:
  • data (xr.Dataset) – Model dataset with time dimension.

  • data_ref (xr.Dataset) – Reference dataset with time dimension.

  • var (str) – Variable name.

  • alpha (float) – Significance level (default: 0.05 for 95% confidence).

  • min_samples (int) – Minimum number of samples required to perform test. Default is 3.

Returns:

Boolean array where True indicates statistically significant differences, with the same spatial dimensions as the input data.

Return type:

xr.DataArray
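The step from per-grid-point p-values to the returned Boolean array can be sketched as below. This is an assumption about the final thresholding only (p < alpha, with untested points treated as not significant); the helper name significance_mask is hypothetical and the t-test itself is not reproduced.

```python
import numpy as np

def significance_mask(p_values, alpha=0.05):
    """Hypothetical helper: True where the p-value is below alpha.

    NaN p-values (for example, points with too few samples) compare as
    False, so they are never flagged as significant.
    """
    p = np.asarray(p_values, dtype=float)
    with np.errstate(invalid="ignore"):
        return p < alpha

p = np.array([[0.01, 0.20],
              [np.nan, 0.05]])
mask = significance_mask(p, alpha=0.05)
```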

compute_yearly_temporal_means(data: Dataset, var: str) → DataArray

Compute yearly temporal means for a given variable.

Parameters:
  • data (xr.Dataset) – Input dataset with time dimension.

  • var (str) – Variable name to compute means for.

Returns:

Yearly temporal means of the variable.

Return type:

xr.DataArray
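A minimal sketch of yearly averaging, assuming a time-leading NumPy array and a matching datetime64 time axis. The helper yearly_means is hypothetical; the documented method operates on an xarray dataset and may differ in how it handles partial years.

```python
import numpy as np

def yearly_means(values, times):
    """Hypothetical helper: average `values` (time first) over calendar years."""
    values = np.asarray(values, dtype=float)
    # datetime64[Y] stores years since 1970, so add the epoch back.
    years = np.asarray(times).astype("datetime64[Y]").astype(int) + 1970
    unique_years, inverse = np.unique(years, return_inverse=True)
    sums = np.zeros((unique_years.size,) + values.shape[1:])
    np.add.at(sums, inverse, values)                  # accumulate per year
    counts = np.bincount(inverse, minlength=unique_years.size)
    means = sums / counts.reshape((-1,) + (1,) * (values.ndim - 1))
    return unique_years, means

times = np.arange("2000-01", "2002-01", dtype="datetime64[M]")  # 24 months
values = np.arange(24.0)
years, means = yearly_means(values, times)
```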

ttest_at_grid_point(model_vals, ref_vals, min_samples: int = 3)

Perform t-test at a single grid point.

Parameters:
  • model_vals (np.ndarray) – 1D array of model values at a grid point.

  • ref_vals (np.ndarray) – 1D array of reference values at the same grid point.

  • min_samples (int) – Minimum number of samples required to perform the t-test. Default is 3.

Returns:

p-value from the t-test.

Return type:

float

class aqua.diagnostics.Stratification(catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, diagnostic_name: str = 'stratification', vert_coord: str = 'level', loglevel: str = 'WARNING')

Bases: Diagnostic

Diagnostic class for analyzing ocean stratification.

Parameters

catalog : str, optional

Path to the data catalog (e.g., intake-esm catalog).

model : str, optional

Name of the climate model to analyze.

exp : str, optional

Experiment name (e.g., ‘historical’, ‘ssp585’).

source : str, optional

Data source (e.g., ‘CMIP6’, ‘OBS’).

regrid : str, optional

Regridding method or target grid (e.g., ‘1x1’, ‘nearest’).

startdate : str, optional

Start date of the analysis period (format: ‘YYYY-MM-DD’).

enddate : str, optional

End date of the analysis period (format: ‘YYYY-MM-DD’).

loglevel : str, optional

Logging level (default is “WARNING”).

Attributes

logger : logging.Logger

Configured logger for the diagnostic.

Initialize the Stratification diagnostic.

Parameters

catalog : str, optional

Path to the data catalog.

model : str, optional

Name of the climate model to analyze.

exp : str, optional

Experiment name.

source : str, optional

Data source identifier.

regrid : str, optional

Regridding method or target grid.

startdate : str, optional

Start date of the analysis period (format: ‘YYYY-MM-DD’).

enddate : str, optional

End date of the analysis period (format: ‘YYYY-MM-DD’).

diagnostic_name : str, optional

Name of the diagnostic (default: “stratification”).

vert_coord : str, optional

Vertical coordinate name (default: DEFAULT_OCEAN_VERT_COORD).

loglevel : str, optional

Logging level (default: “WARNING”).

MINIMUM_MONTHS_REQUIRED = 12

calculate_rho()

Convert variables to absolute salinity and conservative temperature, then compute potential density.

Updates the internal dataset with the computed potential density anomaly (‘rho’).

Returns

None

compute_climatology(climatology: str = 'season')

Compute climatology for the dataset based on the specified period type.

Depending on the value of self.climatology, the method will:

  • Compute the overall mean across the time dimension if self.climatology is “total”.

  • Otherwise, group the data by the corresponding time accessor (for example “month”, “year” or “season”) and average each group.

Parameters

climatology : str, optional

Type of climatology to compute. Expected values:

  • “month” – Monthly climatology

  • “year” – Yearly climatology

  • “season” – Seasonal climatology

  • “total” – Mean over all available time steps

  • Other – Groups data by time.<self.climatology> and averages

Default is “season”.

Returns

None
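The climatology options described above can be illustrated with a NumPy-only sketch. It assumes a time-leading array and a datetime64 time axis; the function climatology, the SEASONS lookup table and the returned dictionary are hypothetical, and unlike the documented method this sketch rejects any other value instead of grouping by an arbitrary time accessor.

```python
import numpy as np

# Season label for each calendar month (index 0 = January).
SEASONS = np.array(["DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
                    "JJA", "JJA", "SON", "SON", "SON", "DJF"])

def climatology(values, times, kind="season"):
    """Hypothetical helper: mean of `values` (time first) per month, year or season."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times).astype("datetime64[M]")
    if kind == "total":
        return {"total": values.mean(axis=0)}
    months = times.astype(int) % 12 + 1          # months since 1970-01 -> 1..12
    years = times.astype("datetime64[Y]").astype(int) + 1970
    if kind == "month":
        labels = months
    elif kind == "year":
        labels = years
    elif kind == "season":
        labels = SEASONS[months - 1]
    else:
        raise ValueError(f"unsupported climatology in this sketch: {kind!r}")
    # One mean per distinct label, keyed by a plain Python int or str.
    return {label.item(): values[labels == label].mean(axis=0)
            for label in np.unique(labels)}

times = np.arange("2000-01", "2002-01", dtype="datetime64[M]")
values = np.arange(24.0)
clim_month = climatology(values, times, "month")
clim_season = climatology(values, times, "season")
clim_total = climatology(values, times, "total")
```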

compute_mld()

Compute the mixed layer depth (MLD) from the density field.

Uses the potential density anomaly (‘rho’) in the dataset to compute MLD and adds it as ‘mld’.

Returns

None
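This reference does not say which criterion compute_mld() uses. Purely as an illustration, the sketch below applies a common density-threshold definition: the MLD is the shallowest level at or below a reference depth where density exceeds the reference-depth density by a fixed amount. The function name, the 10 m reference depth and the 0.03 kg/m³ threshold are all assumptions, not values taken from AQUA.

```python
import numpy as np

def mixed_layer_depth(rho, depth, ref_depth=10.0, threshold=0.03):
    """Hypothetical helper: threshold-based MLD for one profile.

    `depth` must be increasing (positive downwards) and `rho` is the
    potential density (or its anomaly) on those levels.
    """
    rho = np.asarray(rho, dtype=float)
    depth = np.asarray(depth, dtype=float)
    rho_ref = np.interp(ref_depth, depth, rho)    # density at the reference depth
    exceeds = (depth >= ref_depth) & (rho - rho_ref >= threshold)
    if not exceeds.any():
        return float("nan")                       # profile never crosses the threshold
    return float(depth[np.argmax(exceeds)])       # first level that crosses it

depth = np.array([5.0, 10.0, 20.0, 30.0, 50.0, 100.0])
rho = np.array([25.00, 25.00, 25.01, 25.02, 25.10, 25.50])
mld = mixed_layer_depth(rho, depth)
mld_mixed = mixed_layer_depth(np.full(depth.size, 25.0), depth)
```

The sketch returns the depth of a model level rather than interpolating between levels, which keeps it short at the cost of vertical precision.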

compute_stratification()

Compute the stratification by calculating climatology and density.

This method first computes the climatology (default: seasonal) and then computes the potential density. Updates the internal dataset with the results.

Returns

None

run(outputdir: str = '.', rebuild: bool = True, region: str = None, var: list = ['thetao', 'so'], dim_mean=None, climatology: str = 'month', reader_kwargs: dict = {}, mld: bool = False)

Run the stratification diagnostic workflow.

This method orchestrates the complete diagnostic process:

  1. Reads the required variables from the input source.

  2. Optionally selects a specified region.

  3. Optionally computes mean values over given dimensions.

  4. Computes stratification by generating climatology and potential density.

  5. Optionally computes mixed layer depth (MLD).

  6. Saves the processed dataset to a NetCDF file.

Parameters

outputdir : str, optional

Directory where the output NetCDF file will be saved. Default is the current directory (“.”).

rebuild : bool, optional

If True, overwrite the existing output file. Default is True.

region : str, optional

Name of the region to select for analysis. If None, no region selection is applied.

var : list of str, optional

Names of variables to retrieve. Default is [“thetao”, “so”].

dim_mean : list of str or str, optional

Dimensions over which to average the data. If None, no averaging is applied.

climatology : str, optional

Type of climatology to compute (“month”, “year”, “season”, “total”). Default is “month”.

reader_kwargs : dict, optional

Additional keyword arguments passed to the data reader.

mld : bool, optional

If True, compute mixed layer depth (MLD) and include it in the output.

Returns

None

save_netcdf(data, diagnostic: str = 'ocean_circulation', diagnostic_product: str = 'stratification', region: str = None, outputdir: str = '.', rebuild: bool = True)

Save the diagnostic output to a NetCDF file.

Parameters

data : xarray.Dataset or xarray.DataArray

The dataset or data array to save.

diagnostic : str, optional

High-level diagnostic category (default is “ocean_circulation”).

diagnostic_product : str, optional

Specific diagnostic product name (default is “stratification”).

region : str, optional

Region name to include in metadata or filename.

outputdir : str, optional

Directory where the NetCDF file will be saved (default is current directory).

rebuild : bool, optional

If True, force rebuild of NetCDF file even if it exists (default is True).

class aqua.diagnostics.Timeseries(diagnostic_name: str = 'timeseries', catalog: str = None, model: str = None, exp: str = None, source: str = None, regrid: str = None, startdate: str = None, enddate: str = None, std_startdate: str = None, std_enddate: str = None, region: str = None, lon_limits: list = None, lat_limits: list = None, loglevel: str = 'WARNING')

Bases: BaseMixin

Timeseries class for retrieving the data of a single experiment and saving it to NetCDF.

Initialize the Timeseries class.

Parameters:
  • diagnostic_name (str) – The name of the diagnostic. Used for logger and filenames. Default is ‘timeseries’.

  • catalog (str) – The catalog to be used. If None, the catalog will be determined by the Reader.

  • model (str) – The model to be used.

  • exp (str) – The experiment to be used.

  • source (str) – The source to be used.

  • regrid (str) – The target grid to be used for regridding. If None, no regridding will be done.

  • startdate (str) – The start date of the data to be retrieved. If None, all available data will be retrieved.

  • enddate (str) – The end date of the data to be retrieved. If None, all available data will be retrieved.

  • std_startdate (str) – The start date of the standard period.

  • std_enddate (str) – The end date of the standard period.

  • region (str) – The region to select. This will define the lon and lat limits.

  • lon_limits (list) – The longitude limits to be used. Overridden by region.

  • lat_limits (list) – The latitude limits to be used. Overridden by region.

  • loglevel (str) – The log level to be used. Default is ‘WARNING’.

MINIMUM_MONTHS_REQUIRED = 2

compute(freq: str, exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True)

Compute the mean of the data. Supports hourly, daily, monthly and annual means.

Parameters:
  • freq (str) – The frequency to be used for the resampling.

  • exclude_incomplete (bool) – If True, exclude incomplete periods.

  • center_time (bool) – If True, the time will be centered.

  • box_brd (bool, optional) – Whether coordinates on the box boundary are included in the area selection. Default is True.

run(var: str, formula: bool = False, long_name: str = None, units: str = None, short_name: str = None, std: bool = False, freq: list = ['monthly', 'annual'], exclude_incomplete: bool = True, center_time: bool = True, box_brd: bool = True, outputdir: str = './', rebuild: bool = True, reader_kwargs: dict = {}, create_catalog_entry: bool = False)

Run all the steps necessary for the computation of the Timeseries. Save the results to netcdf files. Can evaluate different frequencies.

Parameters:
  • var (str) – The variable to be retrieved.

  • formula (bool) – If True, the variable is a formula.

  • long_name (str) – The long name of the variable, if different from the variable name.

  • units (str) – The units of the variable, if different from the original units.

  • short_name (str) – The short name of the variable, if different from the variable name.

  • std (bool) – If True, compute the standard deviation. Default is False.

  • freq (list) – The frequencies to be used for the computation. Available options are ‘hourly’, ‘daily’, ‘monthly’ and ‘annual’. Default is [‘monthly’, ‘annual’].

  • exclude_incomplete (bool) – If True, exclude incomplete periods.

  • center_time (bool) – If True, the time will be centered.

  • box_brd (bool) – Whether coordinates on the box boundary are included in the area selection.

  • outputdir (str) – The directory to save the data.

  • rebuild (bool) – If True, rebuild the data from the original files.

  • reader_kwargs (dict) – Additional keyword arguments for the Reader. Default is an empty dictionary.

  • create_catalog_entry (bool) – If True, create a catalog entry for the data. Default is False.

class aqua.diagnostics.Trends(model: str, exp: str, source: str, catalog: str = None, regrid: str = None, startdate: str = None, enddate: str = None, diagnostic_name: str = 'trends', vert_coord: str = 'level', loglevel: str = 'WARNING')

Bases: Diagnostic

Class to compute trends over time.

Initialize the Trends class.

Parameters:
  • model (str) – Climate model name.

  • exp (str) – Experiment name.

  • source (str) – Data source name.

  • catalog (str, optional) – Path to the data catalog.

  • regrid (str, optional) – Regridding method.

  • startdate (str, optional) – Start date for data selection.

  • enddate (str, optional) – End date for data selection.

  • diagnostic_name (str, optional) – Name of the diagnostic for filenames. Defaults to “trends”.

  • vert_coord (str, optional) – Name of the vertical dimension coordinate. Defaults to DEFAULT_OCEAN_VERT_COORD.

  • loglevel (str, optional) – Logging level. Default is “WARNING”.

MINIMUM_MONTHS_REQUIRED = 12

adjust_trend_for_time_frequency(trend, y_array)

Adjust trend values based on the time frequency of the data.

Parameters:
  • trend (xr.DataArray) – Trend values to adjust.

  • y_array (xr.DataArray) – Original data array with time coordinate.

Returns:

Adjusted trend values.

Return type:

xr.DataArray

compute_trend(data: DataArray | Dataset)

Compute linear trend coefficients over time.

Parameters:

data (xr.DataArray or xr.Dataset) – Input data with a time dimension.

Returns:

Trend coefficients adjusted for time frequency.

Return type:

xr.DataArray or xr.Dataset
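One way to read "trend coefficients adjusted for time frequency" is a least-squares slope per time step, rescaled to a per-year rate. The sketch below follows that reading with NumPy; it is an assumption, not the library's code, and the helper linear_trend_per_year and its samples_per_year argument are hypothetical.

```python
import numpy as np

def linear_trend_per_year(values, samples_per_year=12):
    """Hypothetical helper: linear trend along the first (time) axis, per year.

    Assumes evenly spaced samples; samples_per_year=12 means monthly data.
    """
    values = np.asarray(values, dtype=float)
    steps = np.arange(values.shape[0], dtype=float)
    # Fit all grid points at once: polyfit accepts a 2D y of shape (time, points).
    flat = values.reshape(values.shape[0], -1)
    slope_per_step = np.polyfit(steps, flat, deg=1)[0]
    # Convert "per time step" into "per year".
    return (slope_per_step * samples_per_year).reshape(values.shape[1:])

t = np.arange(36.0)                      # three years of monthly samples
monthly = 0.5 * t                        # rises by 0.5 per month
trend = linear_trend_per_year(monthly)   # about 6.0 per year

field = np.stack([0.5 * t, -1.0 * t], axis=1)
field_trend = linear_trend_per_year(field)
```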

run(outputdir: str = '.', rebuild: bool = True, region: str = None, var: list = ['thetao', 'so'], dim_mean: type = None, reader_kwargs: dict = {})

Run the trend analysis workflow.

Parameters:
  • outputdir (str, optional) – Directory to save output files. Default is current directory.

  • rebuild (bool, optional) – If True, rebuild existing files. Default is True.

  • region (str, optional) – Geographical region for analysis.

  • var (list, optional) – List of variable names to analyze. Default is [‘thetao’, ‘so’].

  • dim_mean (str or list, optional) – Dimension(s) over which to compute the mean. Default is None.

  • reader_kwargs (dict, optional) – Additional keyword arguments for the data reader. Default is {}.

save_netcdf(diagnostic_product: str = 'trend', outputdir: str = '.', rebuild: bool = True)

Save trend coefficients to a NetCDF file.

Parameters:
  • diagnostic (str, optional) – Diagnostic name for filenames. Default is “trends”.

  • diagnostic_product (str, optional) – Product type for filenames. Default is “spatial_trend”.

  • region (str, optional) – Geographical region for analysis.

  • outputdir (str, optional) – Directory to save output files. Default is current directory.

  • rebuild (bool, optional) – If True, rebuild existing files. Default is True.

select_region(data, region=None, drop=True, dim_mean=None)

Select a region and optionally compute mean over specified dimensions.

Parameters:
  • data (xr.Dataset) – Input dataset.

  • region (str, optional) – Geographical region to select.

  • drop (bool, optional) – Whether to drop coordinates outside the region. Default is True.

  • dim_mean (str or list, optional) – Dimension(s) over which to compute the mean.

Returns:

(data, region) - Processed data and region name.

Return type:

tuple

aqua.diagnostics.extract_realizations(catalog, model, exp, source)

Extract the realizations available for a given catalog, model, exp and source.

Parameters:
  • catalog (str) – Intake catalog name.

  • model (str) – Model name.

  • exp (str) – Experiment name.

  • source (str) – Source name.

Returns:

List of available realizations.

Return type:

list

aqua.diagnostics.merge_from_data_files(variable: str = None, ens_dim: str = 'ensemble', model_names: list[str] = None, data_path_list: list[str] = None, startdate: str = None, enddate: str = None, loglevel: str = 'WARNING')

Merge ensemble NetCDF files along the ensemble dimension with optional temporal selection.

This function loads NetCDF files from the given paths, assigns an ensemble dimension, optionally subsets the data by start and end dates, and concatenates the datasets into a single xarray.Dataset along ens_dim. Model names are assigned to each ensemble member for metadata tracking.

Parameters:
  • variable (str, optional) – Name of the variable to merge. Defaults to None.

  • ens_dim (str, optional) – Name of the ensemble dimension. Defaults to “ensemble”.

  • model_names (list[str], optional) – List of model names. Must correspond to the sequence of files in data_path_list. If multiple realizations exist for a model, repeat model names accordingly.

  • data_path_list (list[str], optional) – List of file paths to NetCDF datasets. Mandatory.

  • startdate (str, optional) – Start date for temporal subsetting (YYYY-MM-DD). Defaults to None.

  • enddate (str, optional) – End date for temporal subsetting (YYYY-MM-DD). Defaults to None.

  • loglevel (str, optional) – Logging level. Defaults to “WARNING”.

Returns:

Merged dataset concatenated along ens_dim, with model names in metadata. If the dataset has a time dimension, the data is sliced according to startdate and enddate.

Return type:

xarray.Dataset

aqua.diagnostics.reader_retrieve_and_merge(variable: str = None, ens_dim: str = 'ensemble', catalog_list: list[str] = None, model_list: list[str] = None, exp_list: list[str] = None, source_list: list[str] = None, reader_kwargs: dict[str, list[str]] = None, realization: dict[str, list[str]] = None, region: str = None, lon_limits: float = None, lat_limits: float = None, startdate: str = None, enddate: str = None, regrid: str = None, areas: bool = False, fix: bool = False, loglevel: str = 'WARNING')

Retrieve, merge, and slice datasets from multiple models, experiments, and sources.

This function uses the AQUA Reader class to load data for a specified variable from multiple catalogs, models, experiments, and sources. Individual realizations are loaded, optionally subset by spatial (lon/lat) or temporal (start/end date) constraints, and concatenated along a specified ensemble dimension. The final merged dataset contains all requested ensemble members with appropriate metadata.

Parameters:
  • variable (str, optional) – Name of the variable to retrieve. Defaults to None.

  • ens_dim (str, optional) – Name of the ensemble dimension for concatenation. Defaults to “ensemble”.

  • catalog_list (list[str], optional) – List of AQUA catalogs to retrieve data from. Defaults to None.

  • model_list (list[str], optional) – List of models corresponding to catalogs and experiments. Defaults to None.

  • exp_list (list[str], optional) – List of experiments corresponding to models and sources. Defaults to None.

  • source_list (list[str], optional) – List of sources corresponding to models and experiments. Defaults to None.

  • realization (dict[str, list[str]], optional) – Dictionary specifying realizations per model. Defaults to None.

  • region (str, optional) – Region for zonal or spatial selections. Defaults to None.

  • lon_limits (float, optional) – Longitude limits for spatial subsetting. Defaults to None.

  • lat_limits (float, optional) – Latitude limits for spatial subsetting. Defaults to None.

  • startdate (str, optional) – Start date for temporal subsetting. Defaults to None.

  • enddate (str, optional) – End date for temporal subsetting. Defaults to None.

  • regrid (str, optional) – Grid to reproject data onto. Defaults to None.

  • areas (bool, optional) – Whether to calculate area-weighted values. Defaults to False.

  • fix (bool, optional) – Apply data fixes if necessary. Defaults to False.

  • loglevel (str, optional) – Logging level for messages. Defaults to “WARNING”.

Returns:

Merged dataset containing all requested ensemble members, concatenated along ens_dim with metadata including description, variable, and ensemble member labels.

Return type:

xarray.Dataset

Raises:

RuntimeError – If no datasets are successfully retrieved from AQUA Reader.

Notes

  • If catalog_list, model_list, exp_list, and source_list are all None or empty, the function returns None.

  • Handles missing or default realizations by using [“r1”].

  • Automatically frees memory after processing individual datasets.
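The fallback to [“r1”] mentioned in the notes can be pictured with a few lines of plain Python. The helper resolve_realizations and the model names are hypothetical placeholders; only the default value comes from the notes above.

```python
def resolve_realizations(model_list, realization=None, default=("r1",)):
    """Hypothetical helper: pick the realizations to load for each model.

    Models that are missing from `realization`, or mapped to an empty
    list, fall back to the default ["r1"].
    """
    realization = realization or {}
    return {model: list(realization.get(model) or default)
            for model in model_list}

resolved = resolve_realizations(
    ["model_a", "model_b"],
    realization={"model_b": ["r1", "r2"]},
)
```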