ecmwf_models.era5 package

Submodules

ecmwf_models.era5.download module

Module to download ERA5 from terminal in netcdf and grib format.

class ecmwf_models.era5.download.CDSStatusTracker[source]

Bases: object

Track the status of the CDS download by using the CDS callback functions

handle_error_function(*args, **kwargs)[source]
statuscode_error = -1
statuscode_ok = 0
statuscode_unavailable = 10
ecmwf_models.era5.download.default_variables(product='era5')[source]

The variables that are downloaded when none are passed by the user.

Parameters

product (str, optional (default: 'era5')) – Name of the era5 product to read the default variables for. Either ‘era5’ or ‘era5-land’.

ecmwf_models.era5.download.download_and_move(target_path, startdate, enddate, product='era5', variables=None, keep_original=False, h_steps=(0, 6, 12, 18), grb=False, dry_run=False, grid=None, remap_method='bil', cds_kwds={}, stepsize='month', n_max_request=1000) → int[source]

Downloads the data from the ECMWF servers and moves it to the target path. This is done in 30-day increments between the start and end date.

The files are then extracted into separate grib files per parameter and stored in yearly folders under the target_path.

Parameters
  • target_path (str) – Path where the files are stored to

  • startdate (datetime) – first date to download

  • enddate (datetime) – last date to download

  • product (str, optional (default: 'era5')) – Either ‘era5’ or ‘era5-land’

  • variables (list, optional (default: None)) – Name of variables to download

  • keep_original (bool (default: False)) – keep the original downloaded data

  • h_steps (list or tuple) – List of full hours for which data is downloaded at the selected dates, e.g. [0, 12]

  • grb (bool, optional (default: False)) – Download data as grib files

  • dry_run (bool) – Do not download anything; this is only used for testing the functions

  • grid (dict, optional (default: None)) –

    A grid on which to remap the data using CDO. This must be a dictionary using CDO’s grid description format, e.g.:

    grid = {
        "gridtype": "lonlat",
        "xsize": 720,
        "ysize": 360,
        "xfirst": -179.75,
        "yfirst": 89.75,
        "xinc": 0.5,
        "yinc": -0.5,
    }
    

    Default is to use no regridding.

  • remap_method (str, optional (default: 'bil')) –

    Method to be used for regridding. Available methods are:

      - “bil”: bilinear (default)
      - “bic”: bicubic
      - “nn”: nearest neighbour
      - “dis”: distance weighted
      - “con”: 1st order conservative remapping
      - “con2”: 2nd order conservative remapping
      - “laf”: largest area fraction remapping

  • cds_kwds (dict, optional (default: {})) – Additional arguments to be passed to the CDS API retrieve request.

  • n_max_request (int, optional (default: 1000)) – Maximum size that a request can have to be processed by CDS. At the time of writing this is 1000 (N_timestamps * N_variables in a request), but as this is a server-side setting, it can change.
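CDO grid description files list one "key = value" pair per line, so a grid dictionary like the one shown for the grid parameter maps onto that format directly. The following is a minimal sketch of that mapping; grid_to_cdo_description is a hypothetical helper written for illustration and is not necessarily how the package hands the grid to CDO.

```python
def grid_to_cdo_description(grid):
    """Render a grid dictionary as CDO grid description text.

    Hypothetical helper for illustration only: writes one
    'key = value' line per dictionary entry, in insertion order.
    """
    return "\n".join(f"{key} = {value}" for key, value in grid.items()) + "\n"


# Same 0.5 degree global grid as in the parameter description
grid = {
    "gridtype": "lonlat",
    "xsize": 720,
    "ysize": 360,
    "xfirst": -179.75,
    "yfirst": 89.75,
    "xinc": 0.5,
    "yinc": -0.5,
}

description = grid_to_cdo_description(grid)
print(description)
```

The resulting text could be saved to a file and inspected before passing the dictionary to download_and_move.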

Returns

status_code –

  • 0 : Downloaded data ok
  • -1 : Error
  • 10 : No data available for requested time period

Return type

int
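A caller may want to turn the returned status code into a readable message. The sketch below mirrors the statuscode_ok, statuscode_error and statuscode_unavailable values listed for CDSStatusTracker; describe_status and STATUS_MESSAGES are hypothetical names introduced here, not part of the package.

```python
# Values copied from the CDSStatusTracker attributes documented above
STATUS_MESSAGES = {
    0: "Downloaded data ok",
    -1: "Error",
    10: "No data available for requested time period",
}


def describe_status(status_code):
    """Return a readable message for a download status code (hypothetical helper)."""
    return STATUS_MESSAGES.get(status_code, f"Unknown status code: {status_code}")


for code in (0, -1, 10):
    print(code, describe_status(code))
```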

ecmwf_models.era5.download.download_era5(c, years, months, days, h_steps, variables, target, grb=False, product='era5', dry_run=False, cds_kwds={})[source]

Download ERA5 reanalysis data on single levels for a defined time span.

Parameters
  • c (cdsapi.Client) – Client to pass the request to

  • years (list) – Years for which data is downloaded, e.g. [2017, 2018]

  • months (list) – Months for which data is downloaded, e.g. [4, 8, 12]

  • days (list) – Days for which data is downloaded (list(range(1, 32)) = all days), e.g. [10, 20, 31]

  • h_steps (list) – List of full hours for which data is downloaded at the selected dates, e.g. [0, 12]

  • variables (list, optional (default: None)) – List of variables to pass to the client. If None is passed, the default variables will be downloaded.

  • target (str) – File name, where the data is stored.

  • grb (bool, optional (default: False)) – Download data in grib format

  • product (str) – ERA5 data product to download, either era5 or era5-land

  • dry_run (bool, optional (default: False)) – Do not download anything; this is only used for testing the functionality

  • cds_kwds (dict, optional) – Additional arguments to be passed to the CDS API retrieve request.

Returns

success – Return True after downloading finished

Return type

bool
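To illustrate how the years, months, days and h_steps lists could translate into a CDS API request, here is a minimal sketch. The helper build_request is hypothetical, and the request keys follow the publicly documented CDS examples for the "reanalysis-era5-single-levels" dataset; the keys that download_era5 actually sends may differ, and key names can change with the CDS API version.

```python
def build_request(years, months, days, h_steps, variables, grb=False):
    """Assemble a CDS-style request dictionary (hypothetical helper).

    Assumption: key names follow the public CDS examples and are not
    taken from the package's own implementation.
    """
    return {
        "product_type": "reanalysis",
        "format": "grib" if grb else "netcdf",
        "variable": list(variables),
        "year": [str(y) for y in years],
        "month": [f"{m:02d}" for m in months],
        "day": [f"{d:02d}" for d in days],
        # Full hours are sent as zero-padded HH:00 strings
        "time": [f"{h:02d}:00" for h in h_steps],
    }


request = build_request(
    years=[2017, 2018],
    months=[4, 8, 12],
    days=[10, 20, 31],
    h_steps=[0, 12],
    variables=["volumetric_soil_water_layer_1"],
)
print(request)

# With a configured client, such a request could be submitted roughly as:
#   import cdsapi
#   cdsapi.Client().retrieve("reanalysis-era5-single-levels", request, "target.nc")
```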

ecmwf_models.era5.download.main(args)[source]
ecmwf_models.era5.download.parse_args(args)[source]

Parse command line parameters for recursive download

Parameters

args (list) – Command line parameters as list of strings

Returns

clparams – Parsed command line parameters

Return type

argparse.Namespace

ecmwf_models.era5.download.run()[source]
ecmwf_models.era5.download.split_array(array, chunk_size)[source]

Split an array into chunks of a given size.

Parameters
  • array (array-like) – Array to split into chunks

  • chunk_size (int) – Size of each chunk

Returns

chunks – List of chunks

Return type

list
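The chunking behaviour described above can be sketched with plain list slicing. split_into_chunks is a hypothetical stand-in, and it assumes that the last chunk is simply shorter when the array length is not a multiple of chunk_size; the package function may handle that case differently.

```python
def split_into_chunks(array, chunk_size):
    """Split a sequence into consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [array[i:i + chunk_size] for i in range(0, len(array), chunk_size)]


chunks = split_into_chunks(list(range(7)), 3)
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```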

ecmwf_models.era5.download.split_chunk(timestamps, n_vars, n_hsteps, max_req_size=1000, reduce=False, daily_request=False)[source]

Split the passed time stamps into chunks for a valid request. One chunk can at most hold data for one month or one day, but cannot be larger than the maximum request size.

Parameters
  • timestamps (pd.DatetimeIndex) – List of daily timestamps to split into chunks

  • n_vars (int) – Number of variables in each request.

  • n_hsteps (int) – Number of hour steps per day in each request.

  • max_req_size (int, optional (default: 1000)) – Maximum size of a request that the CDS API can handle

  • reduce (bool, optional (default: False)) – Return only the start and end of each subperiod instead of all time stamps.

  • daily_request (bool, optional (default: False)) – Only submit daily requests, otherwise monthly requests are allowed (if the max_req_size is not reached).

Returns

chunks – List of start and end dates that contain a chunk that the API can handle.

Return type

list
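The splitting logic can be sketched with the standard library alone. The example below assumes that the size of a request is the number of days times the number of hour steps times the number of variables, and it uses datetime.date objects instead of a pd.DatetimeIndex; chunk_dates is a hypothetical name and not the package's implementation.

```python
from datetime import date, timedelta
from itertools import groupby


def chunk_dates(days, n_vars, n_hsteps, max_req_size=1000):
    """Group daily dates by month and cap each chunk at max_req_size.

    Returns a list of (start, end) tuples, one per chunk.
    """
    # Largest number of days whose request stays within the limit
    max_days = max_req_size // (n_vars * n_hsteps)
    if max_days < 1:
        raise ValueError("A single day already exceeds max_req_size")

    chunks = []
    by_month = groupby(sorted(days), key=lambda d: (d.year, d.month))
    for _, month_days in by_month:
        month_days = list(month_days)
        # A chunk never spans two months and never exceeds max_days
        for i in range(0, len(month_days), max_days):
            part = month_days[i:i + max_days]
            chunks.append((part[0], part[-1]))
    return chunks


# 60 days starting 2020-01-01, i.e. all of January and February 2020
days = [date(2020, 1, 1) + timedelta(days=i) for i in range(60)]
chunks = chunk_dates(days, n_vars=10, n_hsteps=4, max_req_size=1000)
for start, end in chunks:
    print(start, end)
```

With 10 variables and 4 hour steps, at most 25 days fit into one request, so each of the two months is split into two chunks.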

ecmwf_models.era5.interface module

This module contains ERA5/ERA5-Land specific child classes of the netcdf and grib base classes, that are used for reading all ecmwf products.

class ecmwf_models.era5.interface.ERA5GrbDs(root_path: str, parameter: Collection[str] = ('swvl1', 'swvl2'), h_steps: Collection[int] = (0, 6, 12, 18), product: Literal['era5', 'era5-land'] = 'era5', subgrid: Optional[CellGrid] = None, mask_seapoints: Optional[bool] = False, array_1D: Optional[bool] = False)[source]

Bases: ERAGrbDs

class ecmwf_models.era5.interface.ERA5GrbImg(filename: str, parameter: Optional[Collection[str]] = ('swvl1', 'swvl2'), subgrid: Optional[CellGrid] = None, mask_seapoints: Optional[bool] = False, array_1D=False)[source]

Bases: ERAGrbImg

class ecmwf_models.era5.interface.ERA5NcDs(root_path: str, parameter: Collection[str] = ('swvl1', 'swvl2'), product: Literal['era5', 'era5-land'] = 'era5', h_steps: Collection[int] = (0, 6, 12, 18), subgrid: Optional[CellGrid] = None, mask_seapoints: Optional[bool] = False, array_1D: Optional[bool] = False)[source]

Bases: ERANcDs

Reader for a stack of ERA5 netcdf image files.

Parameters
  • root_path (str) – Path to the image files to read.

  • parameter (list or str, optional (default: ('swvl1', 'swvl2'))) – Name of parameters to read from the image file.

  • product (str, optional (default: 'era5')) – Which ERA5 product to read, either era5 or era5-land.

  • h_steps (list, optional (default: (0,6,12,18))) – List of full hours to read images for.

  • subgrid (pygeogrids.CellGrid, optional (default: None)) – Read only data for points of this grid and not global values.

  • mask_seapoints (bool, optional (default: False)) – Read the land-sea mask to mask points over water and set them to nan. This option needs the ‘lsm’ parameter to be in the file!

  • array_1D (bool, optional (default: False)) – Read data as list, instead of 2D array, used for reshuffling.

class ecmwf_models.era5.interface.ERA5NcImg(filename: str, parameter: Optional[Collection[str]] = ('swvl1', 'swvl2'), product: Literal['era5', 'era5-land'] = 'era5', subgrid: Optional[CellGrid] = None, mask_seapoints: Optional[bool] = False, array_1D: Optional[bool] = False)[source]

Bases: ERANcImg

ecmwf_models.era5.reshuffle module

Module for a command line interface to convert ERA5 data into a time series format using the repurpose package.

ecmwf_models.era5.reshuffle.main(args)[source]
ecmwf_models.era5.reshuffle.parse_args(args)[source]

Parse command line parameters for conversion from image to time series.

Parameters

args (list) – command line parameters as list of strings

Returns

args – Parsed command line parameters

Return type

argparse.Namespace

ecmwf_models.era5.reshuffle.reshuffle(input_root, outputpath, startdate, enddate, variables, product=None, bbox=None, h_steps=(0, 6, 12, 18), land_points=False, imgbuffer=50)[source]

Reshuffle method applied to ERA images for conversion into netcdf time series format.

Parameters
  • input_root (str) – Input path where ERA image data was downloaded to.

  • outputpath (str) – Output path, where the reshuffled netcdf time series are stored.

  • startdate (datetime) – Start date, from which images are read and time series are generated.

  • enddate (datetime) – End date, until which images are read and time series are generated.

  • variables (tuple or list or str) – Variables to read from the passed images and convert into time series format.

  • product (str, optional (default: None)) – Either era5 or era5-land, if None is passed we guess the product from the downloaded image files.

  • bbox (tuple, optional (default: None)) – (min_lon, min_lat, max_lon, max_lat) - wgs84. To load only a subset of the global grid / file.

  • h_steps (list or tuple, optional (default: (0, 6, 12, 18))) – Hours at which images are read for each day and used for reshuffling, therefore this defines the sub-daily temporal resolution of the time series that are generated.

  • land_points (bool, optional (default: False)) – Reshuffle only land points. Uses the ERA5 land mask to create a land grid. The land grid is fixed to 0.25*0.25 or 0.1*0.1 deg for now.

  • imgbuffer (int, optional (default: 50)) – How many images to read at once before writing time series. This number affects how many images are stored in memory and should be chosen according to the available amount of memory and the size of a single image.
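When choosing imgbuffer, a rough memory estimate can help. The sketch below assumes that each variable of an image is held as a 4-byte float array and uses the global 0.25 degree ERA5 grid of 721 x 1440 points; buffer_size_mib is a hypothetical helper, and actual memory use will be higher because of overhead that is not modelled here.

```python
def buffer_size_mib(imgbuffer, n_vars, n_lat, n_lon, bytes_per_value=4):
    """Estimate the memory held by a buffer of images, in MiB.

    Assumption: every variable of every image is stored as one
    n_lat x n_lon array with bytes_per_value bytes per grid point.
    """
    n_bytes = imgbuffer * n_vars * n_lat * n_lon * bytes_per_value
    return n_bytes / 1024 ** 2


# 50 buffered images with 2 variables on the global 0.25 degree grid
estimate = buffer_size_mib(imgbuffer=50, n_vars=2, n_lat=721, n_lon=1440)
print(f"{estimate:.0f} MiB")
```

Under these assumptions the default buffer of 50 images with two variables stays just under 400 MiB.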

ecmwf_models.era5.reshuffle.run()[source]

Module contents